A weekly dose of links: the archive of the Dane i Analizy newsletter
The State Of Web Scraping in 2021 – Mihai’s Blog
Author: Mihai Avram | Date: 10/02/2021. The area of web scraping has really expanded in the last few years, and it helps to know some of the main frameworks, protocols, and etiquette so that you can build the next awesome web scraping tool to revolutionize our world! Or maybe just your local neighborhood, or workgroup – that’s fine too. In this post, we will cover: What is web scraping? What are (…)
11 Books Every Software Developer Should Read
Here is a list of some of the best books new software developers can learn from.
What is MDS? MDS with other distances, Field of usage of Multidimensional Scaling
Explaining and reproducing Multidimensional Scaling (MDS) using different distance approaches, with a Python implementation. Dimensionality reduction methods allow examining the dataset along another axis according to the relationships between various parameters, such as correlation, distance, and variance, in datasets with many features. After this stage, operations such as classification are performed on the (…)
From Jupyter Notebook to Deployment — A Straightforward Example
A step-by-step example of taking typical machine learning research code and building a production-ready microservice. This article is intended to serve as a consolidated example of the journey I took in my work as a Data Scientist, beginning from a typical solved problem in Jupyter Notebook format and developing it into a deployed (…)
Word2vec with PyTorch: Implementing the Original Paper
Covering all the implementation details, skipping high-level overview. Code attached.
The „Frequently Bought Together” Recommendation System
A walkthrough in Python using the Apriori and FP-Growth algorithms with the mlxtend library
Big Data Visualization Using Datashader in Python
How does Datashader work and why is it crazy fast?
Map Projection Playground
Adapted for my course „Cartographic and Geodetic Foundations for Planners”. You can find even more static examples of map projections in this overview. The circles for visualizing distortions are taken from here. This is not actually a true implementation of (…)
Migrate From Flask to FastAPI Smoothly
Transition your Flask server for better performance and maintainability
How to create a Threat Detection Model using YOLOv3
This article was published as a part of the Data Science Blogathon. Prerequisites: knowledge of OpenCV is a must, along with a basic understanding of detection algorithms. Overview of the threat detection model: We know that security is always a main concern in every area because of the rise in crime rates in crowded areas or in suspicious isolated areas. […]
Tips and Tricks to Train State-Of-The-Art NLP Models
This is the era of state-of-the-art transformer-based NLP models. With the introduction of packages like transformers by Hugging Face, it is very convenient to train NLP models for any given task. But how do you get an extra edge when everyone is doing the same? How do you get that extra performance out of the model which […]
Building a realtime ticket booking solution with Kafka, FastAPI, and Ably
As the post-pandemic world emerges, the future of events such as summits, conferences or concerts is brighter than ever. Thanks to hybrid events, in-person events are now complemented by online counterparts, which allows event organizers to reach much larger, geographically distributed audiences. For organizers and ticket distributors, providing a great ticket-booking experience to their global (…)
7 Ways to Make Your Python Project Structure More Elegant
Great projects start as a single file script and evolve into a community-maintained framework. But few projects make it to this level. Most, regardless of their usefulness to others, end up not being used by anyone. The critical factor that makes your project convenient (or miserable) for others is its structure. What is a perfect Python project structure that works well? Great projects are (…)
Apache Spark: Bucketing and Partitioning. Scala
Overview of partitioning and bucketing strategy to maximize the benefits while minimizing adverse effects. If you can reduce the overhead of shuffling, the need for serialization, and network traffic, then why not? In the end, performance, better cluster utilization, and cost-efficiency beat it all. If you are to make full use of a (…)
Real Plug-and-Play Supervised Learning AutoML using R and lares
The lares package has multiple families of functions to help the analyst or data scientist achieve quality robust analysis without the need of much coding. One of the most complex but valuable functions we have is (…)
Statistics in Python — Using Chi-Square for Feature Selection
In my previous two articles, I talked about how to measure correlations between the various columns in your dataset and how to detect multicollinearity between them: "Statistics in Python — Understanding Variance, Covariance, and Correlation" and "Statistics in Python — Collinearity and Multicollinearity". However, these techniques are useful (…)
I Built a News Classifier with 96% Accuracy Using CNN
Build a web application for news classification with 96% accuracy using Flask and Keras. Motivation: In this article I will explain how I built, together with Nadia Khan, a convolutional neural network with 96% accuracy to classify German-language news headlines. Furthermore, I will describe how I deployed the final model to serve a simple web application. This is the final web application that will be (…)
Heteroscedasticity Analysis in Time Series Data
A practical approach using NOx concentration data from a gas-turbine-based power plant. Heteroscedasticity is a condition where the error variance is not constant across values of the independent variable, whereas homoscedasticity is a condition where the error variance is constant for any value of the independent variable. The assumption of homoscedasticity is very important in terms of the linear regression (…)
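To make the heteroscedasticity definition above concrete, here is a minimal Python sketch that is not taken from the linked article. It uses synthetic data rather than the NOx measurements. All names (`fit_line`, `residual_variance_ratio`) and the chosen slope, intercept, and noise levels are illustrative assumptions. It fits a straight line and compares the residual variance for the upper half of x against the lower half. This is a crude split-sample check, not a formal statistical test.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_variance_ratio(xs, ys):
    """Residual variance for the upper half of x divided by the lower half."""
    intercept, slope = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by x
    residuals = [y - (intercept + slope * x) for x, y in pairs]
    half = len(residuals) // 2
    lower = statistics.pvariance(residuals[:half])
    upper = statistics.pvariance(residuals[half:])
    return upper / lower


# Synthetic data (hypothetical values, chosen only for illustration).
rng = random.Random(0)
xs = [rng.uniform(1.0, 10.0) for _ in range(2000)]
# Homoscedastic: the noise spread is the same for every x.
ys_homo = [2.0 + 3.0 * x + rng.gauss(0.0, 1.0) for x in xs]
# Heteroscedastic: the noise spread grows with x.
ys_hetero = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5 * x) for x in xs]

ratio_homo = residual_variance_ratio(xs, ys_homo)
ratio_hetero = residual_variance_ratio(xs, ys_hetero)
print(f"homoscedastic ratio:   {ratio_homo:.2f}")
print(f"heteroscedastic ratio: {ratio_hetero:.2f}")
```

A ratio near 1 is what constant error variance would produce, while a ratio well above 1 suggests the spread grows with the independent variable. For real analyses, an established test from a statistics library is the better choice.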
This list of links is put together by an automated script, so please forgive any oddities ;-)