A weekly dose of links: the archive of the Dane i Analizy newsletter
How Much More Time We Spent at Home
We had to do more from home. Here’s how much everything shifted by total minutes in a day. Tags: home, time use
A Comprehensive Guide to Image Processing: Using an OpenCV Tool
Image Processing Essentials: an image processing tool with OpenCV, QT Creator, and C++. As the last part of this series on the Image Processing flow, I want to present a simple Image Processing Tool implemented using OpenCV 3.2.0 on QT Creator (5.12.10) with C++ to apply almost all the Image Processing operations discussed in the previous posts. To be able to use it, it’s enough to build OpenCV (…)
Image Processing Part 2
Image Processing Essentials: A Comprehensive Guide to Image Processing, Part 2. From linear and non-linear spatial filtering to special kernels for smoothing, sharpening, noise removal, and edge detection. Part 2.1: Spatial operations are performed directly on the pixels of a given image and we classify these operations in three categories. “Spatial domain operations” is another word you can come (…)
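As a small illustration of the linear versus non-linear spatial filtering mentioned in this entry, here is a minimal sketch in plain Python. It is not taken from the linked article (which uses OpenCV); the helper names `apply_filter`, `mean_filter`, and `median_filter` are hypothetical, the neighbourhood is fixed at 3x3, and border pixels are simply left unchanged.

```python
from statistics import median


def apply_filter(image, reduce_fn):
    """Apply reduce_fn to every 3x3 neighbourhood; border pixels are copied as-is."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def mean_filter(image):
    # Linear filter: each output pixel is the average of its neighbourhood.
    return apply_filter(image, lambda w: sum(w) / len(w))


def median_filter(image):
    # Non-linear filter: each output pixel is the median of its neighbourhood.
    return apply_filter(image, median)


# A flat grey patch with a single "salt" noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
smoothed = mean_filter(noisy)
denoised = median_filter(noisy)
```

The mean filter spreads the outlier over its neighbours, while the median filter discards it, which is why median filtering is a common choice for impulse noise.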
Automated Machine Learning Model Testing
Try more than 20 machine learning models with only a few lines of code using LazyPredict. We have all been in a situation where we didn’t know which model was best for our ML project, and most likely we tried and evaluated many ML models just to see how they behaved on our data. However, this is not a simple task and requires time and effort. Fortunately, we can do this (…)
From pandas to PySpark
Leveraging your pandas data manipulation skills to learn PySpark. Being able to skillfully and efficiently manipulate big data is a useful skill to have for data analysts, data scientists and anyone working with data. If you are already comfortable with Python and pandas, and want to learn to wrangle big data, a good way to start is to get familiar with PySpark, a Python API for Apache Spark, a (…)
Plotting a Waffle with Python
Why? Pie might be delicious, but pie charts are horrible.
Building a Fast Interactive Dashboard in Jupyter through Gradio
Machine Learning: a ready-to-run tutorial on Gradio, a very powerful Python package for Machine Learning demos. Some days ago, I discovered a very interesting Python package named Gradio. According to its authors, Gradio lets you build demos for Machine Learning. The package is used by machine learning teams at Google, Facebook, and Amazon. Thus, I decided to study this package and build (…)
Hybrid Use of RDBMS and NoSQL for The Transcriptome Data Processing
This article was published as a part of the Data Science Blogathon. The transcriptome sequencing (RNA-seq) method has become quite a routine method for studying model organisms as well as crops. Bioinformatic processing of such experiments yields large volumes of heterogeneous data, represented by the nucleotide sequences of transcripts, amino acid sequences, and (…)
[R] Labelling area plots – Benjamin Nowak
Inserting the legend directly into a graph often makes it easier to read (see this post from Cedric Scherer on how to label barplots for example). But this can be more complicated to achieve for graphs that use cumulative numbers. Here we will see a quick example of how to annotate an area plot directly. For this example, we will use the data of week 33 (year 2021) of Tidy Tuesday, about (…)
article extraction, doc2vec & health news coverage in online media · Jason Timm
Introduction. This post demonstrates a simple procedure for extracting articles from online news sources using the quicknews package. We also demonstrate methods for entity extraction based on a controlled vocabulary (here, the MeSH thesaurus & hierarchically-organized vocabulary), as well as a quick implementation of a doc2vec model. Gather article metadata. While primarily an article (…)
Optimize Memory Tips in Python
Tracking, managing, and optimizing memory usage in Python is a well-understood matter but lacks a comprehensive summary of methods. This…
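To make "tracking memory usage" concrete, here is a minimal sketch using only the standard library (`tracemalloc` and `sys.getsizeof`). The excerpt does not say which tools the linked article covers, so treat this as one possible approach; the helper names `build_list` and `build_generator` are hypothetical.

```python
import sys
import tracemalloc


def build_list(n):
    # Materialises all n values in memory at once.
    return [i * i for i in range(n)]


def build_generator(n):
    # Produces values lazily; only the generator object itself is allocated.
    return (i * i for i in range(n))


# Track allocations made while the list is being built.
tracemalloc.start()
data = build_list(100_000)
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

# Shallow sizes of the container objects (contents are not included).
list_size = sys.getsizeof(data)
gen_size = sys.getsizeof(build_generator(100_000))

print(f"traced: current={current} B, peak={peak} B")
print(f"list object: {list_size} B, generator object: {gen_size} B")
```

Note that `sys.getsizeof` reports only the shallow size of an object, so for nested structures the `tracemalloc` figures are usually the more informative ones.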
Gantt charts with Python’s Matplotlib
A guide to visualizing project schedules with Python. With more than 100 years of history, this visualization continues to be very useful for project management. Henry Gantt initially created the graph for analyzing completed projects. More specifically, he designed this visualization to measure productivity and identify underperforming employees. Through the years, it became a (…)
How to Deploy Interactive Pyvis Network Graphs on Streamlit
Publish your beautiful network graphs online with Python and Streamlit for the world to visualize and interact with. Visualization of network graphs helps us better understand complex relationships between multiple entities. Beyond static images, Python libraries such as Pyvis allow us to build highly interactive graphs for network visualization. Instead of letting these graphs sit (…)
Event deduplication in Logstash and Redis
In distributed systems there are only two hard problems: 2. Exactly-once delivery, 1. Guaranteed message order, 2. Exactly-once delivery. In other words: in this post we will deal with event deduplication 😁. I recently ran into this problem in a SIEM, so the choice fell on Logstash. Why does it matter? As you probably know, the world is not perfect. … (…)
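The general idea behind event deduplication can be sketched in a few lines of Python. This is not the approach from the linked post (the excerpt does not show it), and it does not use Logstash or Redis: `SeenStore` is a hypothetical in-memory stand-in for a key-value store with expiring keys, and the choice of fingerprint fields is an assumption made for illustration.

```python
import hashlib
import json
import time


class SeenStore:
    """Hypothetical in-memory stand-in for a key-value store with expiring keys."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires = {}  # fingerprint -> expiry timestamp

    def add_if_absent(self, key):
        """Record key and return True if it was not seen within the TTL window."""
        now = self.clock()
        expiry = self._expires.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate inside the TTL window
        self._expires[key] = now + self.ttl
        return True


def fingerprint(event, fields):
    """Hash only the fields that define event identity (ignoring e.g. timestamps)."""
    payload = json.dumps({f: event.get(f) for f in fields}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(events, store, fields):
    """Keep the first occurrence of each fingerprint, drop later repeats."""
    return [e for e in events if store.add_if_absent(fingerprint(e, fields))]


events = [
    {"host": "web-1", "message": "login failed", "received_at": 1},
    {"host": "web-1", "message": "login failed", "received_at": 2},  # redelivery
    {"host": "web-2", "message": "login failed", "received_at": 3},
]
unique = deduplicate(events, SeenStore(ttl_seconds=60), fields=("host", "message"))
```

The TTL bounds how long fingerprints are remembered, so the store does not grow without limit; the trade-off is that a duplicate arriving after the window has passed will be let through.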
Why I’m Using VSCode for Jupyter Notebooks
VSCode is a great Python editor and, as I accidentally discovered, good for Jupyter Notebooks, too.
This link roundup is compiled by an automated script, so please forgive any oddities ;-)