
Introduction to Python for Data Analysis: Basic Libraries
Lucas Lumertz
8/1/2025 · 2 min read


Have you ever wondered how data scientists, companies, and even apps like Netflix and Spotify manage to transform numbers and information into smart decisions? The answer lies in a language called Python and its incredible libraries for data analysis!
Today, I'm going to show you how Python works in this world and what the most important libraries are to get started. Everything is explained in a super simple way. Let's go!
What is Python?
Python is a programming language that is easy to learn and very powerful for working with data. But alone, it doesn't do magic—that's why we use libraries, which are like ready-made "toolboxes" for specific tasks.
Think of it this way:
Python → It's like a pencil.
Libraries → They are the colors, rulers, and erasers you use with it to draw incredible things.
What Is Python Used for in Data Analysis?
With Python and its libraries, you can:
✔ Read and organize data (like a gigantic list of sales).
✔ Perform complex calculations (averages, percentages, trends).
✔ Create charts and visualizations (to easily understand the data).
✔ Predict results (like knowing if a product will be a success).
Example: If you have a spreadsheet with a school's grades, Python can help you calculate the class average, see who passed, and even show a chart with the overall performance.
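Here is a minimal sketch of that school example. The student names, the grades, and the passing grade of 6.0 are all made up for illustration; in a real case the table would probably come from a spreadsheet file.

```python
import pandas as pd

# Hypothetical grades, typed by hand so the example runs without any file
grades = pd.DataFrame({
    "student": ["Ana", "Bruno", "Carla", "Diego"],
    "grade": [8.5, 5.0, 7.0, 9.5],
})

PASSING_GRADE = 6.0  # assumed cutoff, adjust to your school's rule

# Class average: the mean of the "grade" column
class_average = grades["grade"].mean()

# New True/False column showing who reached the passing grade
grades["passed"] = grades["grade"] >= PASSING_GRADE

print(f"Class average: {class_average}")  # Class average: 7.5
print(grades)
```

The comparison runs on the whole column at once, so there is no need to write a loop over the students.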
Why Is Python Important?
Python is one of the most widely used languages for data analysis because:
🔹 It's easy to read (the code looks a lot like plain English!).
🔹 It has powerful (and free!) libraries.
🔹 It works in every field (health, finance, marketing, sports).
🔹 Big organizations rely on it (Google, NASA, Netflix, banks).
Without Python, analyzing data would be like trying to cut paper with a spoon—it's possible, but much harder!
The 4 Basic Libraries to Start:
Below, I'll list some libraries that, in my opinion, are super simple to learn.
1. Pandas → The Excel of Python
What is it used for? Reading, filtering, and organizing data into tables.
Example:
import pandas as pd
data = pd.read_csv('sales.csv')  # Read a CSV file into a table (a DataFrame)
print(data.head())  # Show the first five rows
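Reading is only the first step. Below is a small sketch of the filtering and organizing part. Since 'sales.csv' is just a placeholder name, the table here is built by hand, and the column names and the 1000 threshold are my own assumptions.

```python
import pandas as pd

# Made-up sales data standing in for the contents of a real file
data = pd.DataFrame({
    "product": ["Notebook", "Mouse", "Keyboard", "Monitor"],
    "units": [3, 12, 7, 2],
    "price": [3500.0, 80.0, 150.0, 900.0],
})

# Organize: create a new column calculated from two existing ones
data["revenue"] = data["units"] * data["price"]

# Filter: keep only the rows where revenue is above 1000
big_sales = data[data["revenue"] > 1000]

# Sort: highest revenue first
ranked = big_sales.sort_values("revenue", ascending=False)
print(ranked)
```

The Mouse row is dropped by the filter (its revenue is 960), and the remaining products come out ordered as Notebook, Monitor, Keyboard.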
2. NumPy → The Scientific Calculator
What is it used for? Performing fast calculations and working with numbers.
Example:
import numpy as np
numbers = np.array([1, 2, 3, 4, 5])
print(numbers.mean()) # Calculates the average → 3.0
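The "fast calculations" part comes from NumPy applying one operation to every number in the array at once. A quick sketch, with made-up prices and an assumed 10% discount:

```python
import numpy as np

# Hypothetical product prices
prices = np.array([100.0, 250.0, 80.0, 40.0])

# One multiplication is applied to every price, no loop needed
discounted = prices * 0.9

total = prices.sum()           # Sum of all prices → 470.0
highest = prices.max()         # Largest price → 250.0
middle = np.median(prices)     # Median price → 90.0

print(discounted)
print(total, highest, middle)
```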
3. Matplotlib → The Chart Generator
What is it used for? Creating charts to visualize data.
Example:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4], [10, 20, 25, 30]) # Generates a line chart
plt.show()
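A chart is much easier to read with a title and axis labels. Here is one possible way to dress up the same line chart; the label texts and the idea that the numbers are monthly sales are my own assumptions.

```python
import matplotlib.pyplot as plt

months = [1, 2, 3, 4]          # hypothetical month numbers
sales = [10, 20, 25, 30]       # hypothetical sales values

# Create a figure with one chart area (called "axes" in Matplotlib)
fig, ax = plt.subplots()

# Draw the line and mark each data point with a dot
ax.plot(months, sales, marker="o")

# Explain what the chart shows
ax.set_title("Sales per month")
ax.set_xlabel("Month")
ax.set_ylabel("Sales")

plt.show()
```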
4. Seaborn → The Beautiful Charts
What is it used for? Creating polished statistical charts with less code. It is built on top of Matplotlib, so the two work together.
Example:
import matplotlib.pyplot as plt  # Needed for plt.show()
import seaborn as sns
sns.barplot(x=['A', 'B', 'C'], y=[10, 20, 15]) # Bar chart
plt.show()
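Seaborn gets more interesting when you hand it a Pandas table. In the sketch below, the team names and points are made up. When a category has several rows, barplot draws the average of each group by default.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical scores: two games for each team
scores = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "points": [10, 14, 20, 22, 15, 11],
})

# Point Seaborn at the table and name the columns to use.
# Each bar shows the mean of "points" for that team.
ax = sns.barplot(data=scores, x="team", y="points")
ax.set_title("Average points per team")

plt.show()
```

The axis labels come straight from the column names, which is one of the things that saves typing compared with plain Matplotlib.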
Examples of Use Cases
1. E-commerce (Amazon, Mercado Livre)
Online stores can use Pandas to analyze sales and NumPy to calculate promotional prices.
2. Health (Hospitals, Research)
Analysts can use Matplotlib to chart how a disease spreads over time.
3. Sports (FIFA, NBA)
Analysts can use Seaborn to compare player statistics.
4. Finance (Banks, Bitcoin)
Analysts can use Pandas to organize price histories and spot market trends.
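To make the e-commerce case a bit more concrete, here is a toy sketch that combines Pandas and NumPy. The orders, the categories, and the promotion rule (20% off anything costing 100 or more) are invented for illustration and do not describe how any real store works.

```python
import numpy as np
import pandas as pd

# Hypothetical orders
orders = pd.DataFrame({
    "category": ["books", "games", "books", "games", "toys"],
    "price": [40.0, 200.0, 60.0, 300.0, 50.0],
})

# NumPy: apply the assumed promotion rule to every row at once
orders["promo_price"] = np.where(
    orders["price"] >= 100,      # condition
    orders["price"] * 0.8,       # value when the condition is true
    orders["price"],             # value when it is false
)

# Pandas: total promotional revenue for each category
revenue_by_category = orders.groupby("category")["promo_price"].sum()
print(revenue_by_category)
```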
Conclusion
Let's recap what we learned, everyone!
🔹 Python + libraries = superpower for analyzing data.
🔹 Pandas organizes data into tables.
🔹 NumPy performs fast calculations.
🔹 Matplotlib/Seaborn create charts.
🔹 It's used in almost everything (from Netflix to hospitals).
If you want to enter the world of data, starting with Python is the best choice! And the best part: you can test everything for free in Google Colab or Jupyter Notebook.
So, are you ready to run your first analysis? If you have any questions, leave them in the comments! 🚀
📌 Want more content like this? Follow me so you don't miss the next posts!
