From Scala to Python - Python dataclasses

There are two things I missed when I started working with Python after three years of writing Scala code: types and immutability. Fortunately, it turned out that I missed them only because I did not know Python well enough. There is a way of getting some of that functionality in Python without using external libraries!

Spoiler alert. Don’t expect too much. It won’t be like Scala types ;)

Data classes

Let’s start with types. Using data classes, it is possible to specify the type of a field in a class. The most basic usage looks like this:

from dataclasses import dataclass

@dataclass()
class User:
  name: str
  age: int

It looks good, but look what happens when I pass a string as the age.

>>> u = User('Test', 'aaa')
>>> u
User(name='Test', age='aaa')

It works! It should not work! Python stores the annotations but does not check them at runtime. There is a stricter way of defining types, NewType, but it is not enforced at runtime either.

from dataclasses import dataclass
from typing import NewType
Name = NewType('Name', str)
Age = NewType('Age', int)

@dataclass
class User:
  name: Name
  age: Age

>>> User('Name', 123)
User(name='Name', age=123)

>>> User(Name('Test'), Name(1555))
User(name='Test', age=1555)

I am so disappointed.
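
The behavior above follows from how NewType works: calling a NewType at runtime simply returns its argument, and only a static type checker such as mypy treats Name and Age as distinct types. A minimal sketch that makes this visible, reusing the Name and Age definitions from the example:

```python
from typing import NewType

Name = NewType('Name', str)
Age = NewType('Age', int)

# At runtime, calling a NewType returns its argument unchanged.
name = Name('Test')
age = Age(1555)

print(type(name) is str)   # True
print(type(age) is int)    # True

# Nothing is checked, so even a mismatched value passes through untouched.
wrong = Name(1555)
print(type(wrong) is int)  # True
```

Running a static checker over the same file is what reports Name(1555) as an error; the interpreter itself never will.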

Functions

At least we can specify the expected type of a function parameter and the type of the returned value! Or can we?

We can write the annotations, but they do not help at runtime, because nothing stops me from calling the function with the wrong type…

def name_to_age(name: Name) -> Age:
  return Age(123)

>>> name_to_age(Age(50))
123
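
Function annotations are only stored as metadata on the function object; the interpreter never consults them when the function is called. If a runtime check is really wanted without external libraries, one option is a small decorator. The sketch below is only an illustration: enforce_types is a hypothetical helper (not part of the standard library), and it handles only plain classes such as str and int, so NewType and generic annotations are skipped.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Hypothetical helper: check arguments against plain-class annotations."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for arg_name, value in bound.arguments.items():
            expected = hints.get(arg_name)
            # Only plain classes are checked; NewType and generics are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{arg_name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def name_to_age(name: str) -> int:
    return 123


print(name_to_age.__annotations__)  # the hints are only stored metadata
print(name_to_age('Test'))          # 123

try:
    name_to_age(50)
except TypeError as error:
    print(error)                    # name must be str, got int
```

The check runs on every call, so it costs time that a static type checker would not.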

Immutability

Fortunately, there is one thing which works as expected. Python elegantly solves the problem of immutability. All we need to do is add a parameter to the decorator.

@dataclass(frozen=True)
class User:
  name: str
  age: int

>>> u = User('Test', 123)
>>> u.name = 'Another user'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 3, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'name'
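
Coming from Scala, the natural follow-up question is how to get something like a case class copy. A short sketch, assuming the same User class as above: the standard library's dataclasses.replace builds a new instance with selected fields changed, and frozen data classes (with the default eq=True) are hashable, so they can be stored in sets or used as dictionary keys.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class User:
    name: str
    age: int


u = User('Test', 123)

# replace() builds a new instance; the original stays untouched.
older = replace(u, age=124)
print(u)      # User(name='Test', age=123)
print(older)  # User(name='Test', age=124)

# Frozen data classes (with the default eq=True) are hashable.
users = {u, older}
print(len(users))  # 2

# Direct assignment still fails, as in the example above.
try:
    u.age = 200
except FrozenInstanceError as error:
    print(error)
```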

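At least something works. What about types? Python does not enforce type hints at runtime, so on their own they serve only as documentation, although a static type checker such as mypy can verify them before the code runs. For runtime data validation, a library such as pydantic (https://pydantic-docs.helpmanual.io) is needed.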
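
For very small cases, a hand-written check may be enough before reaching for a validation library. The sketch below is an assumption about how that could look, not a replacement for a real validator: ValidatedUser is a hypothetical class name, and the check in __post_init__ works only for plain classes such as str and int.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ValidatedUser:  # hypothetical name, not from the original examples
    name: str
    age: int

    def __post_init__(self):
        # Works only for plain classes such as str and int;
        # NewType or generic annotations would need extra handling.
        for field in fields(self):
            value = getattr(self, field.name)
            if isinstance(field.type, type) and not isinstance(value, field.type):
                raise TypeError(
                    f"{field.name} must be {field.type.__name__}, "
                    f"got {type(value).__name__}"
                )


print(ValidatedUser('Test', 123))

try:
    ValidatedUser('Test', 'aaa')
except TypeError as error:
    print(error)  # age must be int, got str
```

This kind of check quickly becomes unwieldy for nested or optional fields, which is where a dedicated validation library earns its place.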


Bartosz Mikulski

  • Data/MLOps engineer by day
  • DevRel/copywriter by night
  • Python and data engineering trainer
  • Conference speaker
  • Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
  • Twitter: @mikulskibartosz