Artificial intelligence (AI) is not a new kid on the block anymore and the field is developing at a constantly increasing pace. Pretty much every day there is some kind of new development, be it a research paper announcing a new or improved machine learning algorithm, a new library for one of the most popular programming languages (Python/R/Julia), etc.
In the past, many of those advances did not make it to mainstream media. But that is also changing rapidly. Recent examples include AlphaGo beating the 18-time world champion at Go, using Deep Learning to generate…
As the old adage goes, a picture is worth a thousand words. In data science, we often use different kinds of plots to either explore the data or tell a story, frequently to a non-technical audience.
Data visualization is often considered a separate skill that needs dedicated time and effort to master. Pretty much anyone can run a one-liner to create a simple plot. But there are many techniques or details that can vastly improve the graphic and reinforce the message we want to convey with it.
If you have ever ventured into any of the cryptocurrency exchanges, I am sure you have already seen a depth chart, just like the one in the image above. In this article, I wanted to quickly talk about what a depth chart actually is, what kind of information we can infer from it, and then show how to create one using Python.
Please bear in mind that this is an article focusing on obtaining the order book data and creating appropriate visualizations. It is not a piece of investment advice!
A depth chart is a kind of visualization that informs…
In this article, I will briefly describe what decision forests are and how to train tree-based models (such as Random Forest or Gradient Boosted Trees) using the same Keras API as you would normally use for Neural Networks. Let’s dive into it!
I will get straight to the point: decision forests are not another fancy algorithm like XGBoost, LightGBM, or CatBoost. They are simply a family of machine learning algorithms built from many decision trees. That includes many of your favorites like Random Forest and various flavors of gradient-boosted trees.
Until now there was a clear split between machine and…
Recently I was doing EDA using pandas-profiling and something piqued my interest. In the correlations tab, I saw many metrics I have known since university, such as Pearson’s r, Spearman’s ρ, and so on. However, among those I saw something new: Phik (𝜙k). I had not heard of this metric before, so I decided to dive a bit deeper into it.
I have been using Medium regularly since 2018 and I have to admit that I bookmark quite a lot of articles. Given the impressive growth of the platform and its key publications in recent years, there are tons of great articles to read. I often see something that piques my curiosity in one of the email newsletters I receive, on Twitter, LinkedIn, etc. Most of the time, such an article goes straight to my backlog until I can find some time to sit down and read a few of them.
To put a number to my article hoarding habit, my current…
Documentation is undoubtedly one of the crucial tasks of every data scientist, yet most likely also among the least enjoyable ones. I will not try to persuade you of the benefits of keeping documentation up to date; that is a topic for another time.
In this article, I will show you a tool that can help with making the process much faster, more efficient, and even enjoyable. After all, a picture is worth a thousand words. …
Inflation is a word we hear in the news pretty much on a daily basis. We know that, long story short, inflation means that our money is worth less over time. But how much less, and how do we adjust values for inflation? I will answer those questions in this article by showing how to work with inflation in Python. But first…
I won’t spend much time writing about the economic theory of inflation and its consequences, as this is a topic for a much longer article with a different focus. To define inflation in one sentence: it is…
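As a rough illustration of what "adjusting for inflation" usually means in practice, here is a minimal sketch that rescales a nominal amount by the ratio of two price index levels. The function name `adjust_for_inflation` and the index values are hypothetical and made up for the example; they are not real CPI data and not necessarily the approach used in the article.

```python
def adjust_for_inflation(nominal, index_then, index_base):
    """Express a nominal amount in base-period prices.

    nominal     -- amount in the prices of the original period
    index_then  -- price index level in the original period
    index_base  -- price index level in the period we convert to
    """
    return nominal * index_base / index_then


# Hypothetical price index levels (illustrative only, not real data).
price_index = {2019: 100.0, 2020: 102.0, 2021: 105.0}

# How much is 50.0 from 2019 worth when expressed in 2021 prices?
amount_in_2021_prices = adjust_for_inflation(
    50.0, price_index[2019], price_index[2021]
)
print(amount_in_2021_prices)
```

The same ratio works in both directions: swapping the two index arguments converts a later amount back into earlier prices.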
It is a well-known fact that matplotlib is very versatile and can be used to create pretty much any kind of chart you want. It might not be the simplest or prettiest, but after viewing enough questions on Stack Overflow it will most likely work out quite well in the end.
I knew that it was possible to create financial plots such as a candlestick chart in pure matplotlib, but that is not the most pleasant experience and there are much easier ways to do it with libraries such as altair (I covered this in another article). However, only…
In this article, I want to quickly show a few useful pandas methods/functions, which can come in handy during your daily work. To manage expectations, this is not an article showing the basic functionalities of pandas and there is no particular theme to the methods. Without further ado, let’s start!
There are many ways of inspecting whether a Series/DataFrame contains missing values, including dedicated libraries such as missingno. A simple way to check if a column of a DataFrame contains missing values could look as follows:
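The original snippet is not reproduced in this excerpt. A minimal sketch of one common way to do this, assuming a hypothetical DataFrame `df` with made-up columns `a` and `b`, chains `isna()` with `any()`:

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

# isna() flags each missing entry; any() collapses the flags into one boolean.
has_missing_a = df["a"].isna().any()
has_missing_b = df["b"].isna().any()

print(has_missing_a, has_missing_b)
```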
Alternatively, we can use the hasnans attribute of a pd.Series …