mikulskibartosz.name
Bartosz Mikulski
Building trustworthy data pipelines because AI cannot learn from dirty data
All Stories
How to choose the right mini-batch size in deep learning
How to deal with underfitting and overfitting in deep learning
The lessons learned from Andrew Ng’s online course
How to reduce memory usage in Pandas
Fit more data in the same amount of memory
How Airflow scheduler works
Explanation of the Airflow interval and start_date parameters
Guidelines for data science teams — a summary of Daniel Molnar’s talks
Avoiding over-engineering in machine learning
Ludwig machine learning model in Kaggle
My first attempt to use Ludwig
The problem of large categorical variables in machine learning
How to use FeatureHasher in Scikit-learn
Encoding categorical variables in machine learning
One-hot encoding, dummy coding, and effect coding in Scikit-learn and Pandas
How To Avoid Data Leakage While Building A Machine Learning Model
What to do when your model works perfectly during testing but fails in production
Using scikit-automl for building a classification model
My first attempt to use scikit-automl and how I got it working
How to return rows with missing values in Pandas DataFrame
How does it work and why the most popular solution is wrong
Preprocessing the input Pandas DataFrame using ColumnTransformer in Scikit-learn
How to encode text/categorical variables and scale numerical values using only one Scikit-learn class