A weekly dose of links: the archive of the Dane i Analizy newsletter
3 Examples That Show The Unlimited Flexibility of PySpark
A combination of Python and SQL, but easier than both. Spark is an analytics engine used for large-scale data processing. It lets you spread both data and computations over clusters to achieve a substantial performance increase. It is easier than ever to collect, transfer, and store data. Hence, we (…)
Row-wise operations in R: compute row means in tidyverse
This tutorial shows how to perform row-wise operations in R using tidyverse. We will use three key functions, rowwise(), c_across(), and rowMeans(), to perform row-wise operations on a dataframe. The rowwise() and c_across() functions are from dplyr. The rowwise() function is available in dplyr (…)
Streaming With Dataiku Using Kafka & More
In this article, we will go over some of the basic concepts of streaming in the context of a machine learning (ML) project and how you can use Dataiku to ingest or process streaming data.
Building your own knitr compile farm on your Raspberry Pi with {plumber}
Rage is my fuel. I’ve had the {plumber} package on my radar for quite some time, but never tried it. However, a couple of weeks ago, I finally had a reason to try it out and see how the package works. One of my main problems in life is that my work laptop runs Windows, and my second problem is that (…)
FastAPI — Create and Deploy Hot Dog Detector
Create and deploy a hot dog detector, and learn to containerize it using Docker. "I don’t smoke, except for special occasion" (Jian Yang). For those of you familiar with the hit TV series Silicon Valley, you would have guessed the inspiration for this article. Jimmy O. Yang’s (…)
Understand The DBSCAN Clustering Algorithm!
This article was published as a part of the Data Science Blogathon. In this article, I’m gonna explain the DBSCAN algorithm. It … The post appeared first on Analytics Vidhya.
3 Techniques to avoid Overfitting of Decision Trees
Hands-on implementation of pre-pruning, post-pruning, and ensemble of Decision Trees. Decision Trees are a non-parametric supervised machine learning approach for classification and regression tasks. Overfitting (…)
How to Set-up a cost-effective AWS EMR cluster and Jupyter Notebooks for SparkSQL
(Updated Nov. 2020) This is a living article based on my personal notes for work. The AWS interface and options have changed quite a bit over time, so this document is constantly updated in (…)
Practical Guide for Visualizing CNNs Using Saliency Maps
A tutorial in deep learning interpretability with Python. While it’s convenient to solve complex problems by composing supermassive neural networks in computer vision, understanding the impact of each weight on the outcome (…)
AutoML: possibilities and challenges
One of the increasingly discussed topics in AI is AutoML, which according to many works better than a human. This statement is somewhat true, but it is also very misleading, especially for people who do not fully understand the context. It all depends on who is saying it and in (…)
ELO: a new rating system for wine
Exploring an alternative to the star rating system. Whether it be Uber, Amazon or Google, star ratings are everywhere we look these days. For all their simplicity, they suffer from a range of problems. To name some: over-representation of extreme views in an average rating, different standards (…)
This list of links is compiled automatically, so please forgive any oddities ;-)