
Newsletter Dane i Analizy, 2021-09-27

A weekly dose of links: the archive of the Dane i Analizy newsletter

t-SNE Machine Learning Algorithm — A Great Tool for Dimensionality Reduction in Python
How to use t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize high-dimensional data? A successful data scientist understands a wide range of Machine Learning algorithms and can explain the results to (…)

Taking Keras and TensorFlow to the Next Level
11 tips and tricks to make the most out of Keras and TensorFlow. Keras is a beautiful project. While TensorFlow and PyTorch used to compete for being the state-of-the-art, Keras aims at us, the professionals that need to get stuff done. We are okay with using last year’s models instead of betting on the next Transformer-killer or ResNet wannabes. Being Keras part of TensorFlow for nearly two (…)

All you need to know about Tuples in Python
First of all, let’s talk about Python Tuples in general. Tuples are used to store data of multiple types like str, int, float, boolean, etc. in a single variable. Lists and tuples have a lot in common and lots of differences as well. First of all, in both data types, the elements are ordered, so the items have a defined order that will not change. Also, both are dynamic in size data (…)
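As a quick illustration of the points in this excerpt (not taken from the linked article), here is a minimal Python sketch of a tuple holding mixed types in a fixed order. The sample values and variable names are made up.

```python
# A tuple can hold values of different types in a single variable.
record = ("Ada", 36, 1.68, True)

# Elements keep their position, so indexing and unpacking are predictable.
name, age, height, active = record
first = record[0]

# A list with the same content can be changed in place...
as_list = list(record)
as_list[1] = 37

# ...while assigning to a tuple element raises TypeError.
try:
    record[1] = 37
    mutated = True
except TypeError:
    mutated = False
```

Converting with `list(record)` is the usual route when the values need to change.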

A Step-by-Step Guide in detecting causal relationships using Bayesian Structure Learning in Python.
The starter’s guide to effectively determining causality across variables.

How To Visualize Databases as Network Graphs in Python
Build a Dash web application to interactively explore database structures.

How to Install Apache Kafka Using Docker — The Easy Way
And how to create your first Kafka Topic. Video guide available. In a world of big data, a reliable streaming platform is a must. That’s where Kafka comes in. And today, you’ll learn how to install it on your machine and create your first Kafka topic. Want to sit back and watch? I’ve got you covered: (…)

Guide for Data Visualization With Bokeh Python Library
This article was published as a part of the Data Science Blogathon. I am sure many of you have read several articles around the world stating the buzz around “Machine Learning”, “Data Scientist”, “Data Visualization” and so on. Some have branded data science as the sexiest job of the 21st century. A report (…)

Mike Driscoll: Creating an MP3 Tagger GUI with wxPython
I don’t know about you, but I enjoy listening to music. As an avid music fan, I also like to rip my CDs to MP3 so I can listen to my music on the go a bit easier. There is still a lot of music that is unavailable to buy digitally. Unfortunately, when you rip a lot of music, you will sometimes end up with errors in the MP3 tags. Usually, there is a misspelling in a title or a track isn’t tagged (…)

Supervised Learning algorithms cheat-sheet
Complete cheat-sheet for all supervised machine learning algorithms you should know, with pros, cons and hyperparameters. This article provides cheat sheets for different supervised machine learning concepts and algorithms. This is not a tutorial, but it can help you to better understand the structure of machine (…)

How to Send Emails with Python
Python provides a couple of really nice modules that you can use to craft emails with. They are the email and smtplib modules. Instead of going over various methods in these two modules, you’ll spend some time learning how to actually use these modules. Specifically, you’ll be covering the following: The basics of emailing, How … (…)
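To give a feel for the two modules named in this excerpt, here is a minimal sketch (not taken from the linked article) that builds a message with the standard-library `email` package. The helper name `build_message`, the addresses, and the SMTP host and port are placeholders. The sending step is left commented out because it needs a reachable mail server.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email (hypothetical helper for illustration)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


msg = build_message(
    "me@example.com", "you@example.com", "Hello", "Sent from Python."
)

# Sending requires an SMTP server; "localhost" and port 25 are assumptions.
# import smtplib
# with smtplib.SMTP("localhost", 25) as server:
#     server.send_message(msg)
```

Keeping message construction separate from sending makes the first part easy to check without any network access.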

Kafka: The Definitive Guide v2
The long-awaited update to the immensely popular Kafka: The Definitive Guide. Confluent is happy to announce that we will be providing new early release chapters of Kafka: The Definitive Guide v2 as they become available until the completion of the new e-book by the end of 2021. Included in this early release preview are: Chapter 1: Meet Kafka; Chapter 2: Installing Kafka; Chapter 3: Kafka (…)

Fun with Markov Network Brains
An introduction for dummy programmers. Over the course of about 1000 generations, a Markov Network Brain evolves dumb bugs into something capable of finding food immediately. The “bugs” achieve this with no awareness of their environment beyond their physical bodies and antennae. A demo is available (known to work with FF 68.0.1, not 100% cross-browser). A couple of months ago, I (…)

The New Generation Data Lake
The petabyte architecture you cannot afford to miss! The volumes of data used for Machine Learning projects are relentlessly growing. Data scientists and data engineers have turned to Data Lakes to store vast volumes of data and find meaningful insights. Data Lake architectures have evolved over the years to massively scale to (…)

The Most Important Thing in Machine Learning
How many experiments a week do you manage to run? In this episode we reflect on experimentation and why it plays a key role in Machine Learning. You will also learn what experiments of my own I run and how that can help you. I also have homework for you, plus useful pointers that will help you manage experiments. Today’s episode starts out somewhat philosophically, but (…)

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework
Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The team transforms Uber’s ideas into agile, global solutions by designing and implementing scalable solutions. One of the areas of expertise within GSS is the Digitization vertical. The Digitization team efficiently converts physical signals (…)

A lightweight data validation ecosystem with R, GitHub, and Slack
Data quality monitoring is an essential part of any data analysis or business intelligence workflow. As such, an increasing number of promising tools have emerged as part of the Modern Data Stack to offer better orchestration, testing, and reporting. Although I’m very excited about the developments in this space, I realize that emerging products may not be the best fit for every organization. (…)

Simulating Traffic Flow in Python
Although traffic doesn’t always flow smoothly, cars seamlessly crossing intersections and turning and stopping at traffic signals can look quite magnificent. This contemplation got me thinking of how important traffic flow is for human civilization. After this, the nerd inside of me couldn’t resist thinking of a way to simulate traffic flow. I spent a couple of weeks working on an undergraduate (…)

Process ~10M Row Datasets in Milliseconds In This Comprehensive Pandas Speed Guide
Use Pandas the way it was intended to… “Great… another article on how to make Pandas n times faster.” I think I have said that countless times for the past two years I have been using Pandas. The most recent one I saw said, “make Pandas 71,803 times faster”. But I won’t give you that kind of promise. I will just show you how to use Pandas in the fastest way possible. Because you can’t speed up (…)

9 Reasons Why You Should Start Using Python Dataclasses
Towards efficiency and less boilerplate code. Starting from version 3.7, Python has introduced dataclasses (see PEP 557), a new feature that defines classes that contain and encapsulate data. I recently started using this module in a couple of data science projects and I’m really enjoying it. Off the top of my head, I can think of two reasons: less boilerplate code, more (…)
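As a rough illustration of the "less boilerplate" point (not taken from the linked article), the sketch below defines a small dataclass. The class name `Experiment` and its fields are invented for the example. The decorator generates `__init__`, `__repr__` and `__eq__`, which would otherwise be written by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    # Field names and defaults are illustrative only.
    name: str
    learning_rate: float = 0.01
    # Mutable defaults need default_factory so instances do not share a list.
    tags: list = field(default_factory=list)


a = Experiment("baseline")
b = Experiment("baseline")

# Generated __eq__ compares field by field; generated __repr__ is readable.
same = a == b
text = repr(a)

# Each instance gets its own list.
a.tags.append("first-run")
```

Using `field(default_factory=list)` rather than `tags: list = []` matters here, since dataclasses reject a bare mutable default with a `ValueError`.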


This list of links is compiled by an automated script, so please forgive any oddities ;-)
