10 Data Science Terms You Hear All the Time (But Might Not Understand)
Let’s finally decode the jargon that gets thrown around in every data science conversation
If you’ve ever been in a data science meeting, watched a tech talk, or just scrolled through LinkedIn posts, you’ve probably seen terms like “overfitting,” “model bias,” or “p-value” thrown around like everyone just knows what they mean. But let’s be real: a lot of these terms are confusing, and no one explains them clearly.
In this article, I’ll break down 10 common data science terms you’ve probably heard—but might not fully understand. And that’s okay. We’re going to fix that now.
Overfitting
What people say:
“This model looks great on the training data, but it’s probably overfitting.”
What it really means:
Overfitting is when your model learns the noise instead of the signal. It memorizes the training data too well—even the weird exceptions or outliers—and ends up doing badly on new, unseen data. It's like studying only past exam questions and failing the real test because it’s slightly different.
Why it matters:
A model that overfits might look perfect during training but fails in the real world. The goal is to generalize—not to memorize.
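To make this concrete, here's a minimal sketch using scikit-learn on synthetic data (all numbers and names are illustrative, not a recipe): an unconstrained decision tree versus one with a depth limit.

```python
# A quick illustration of overfitting: an unconstrained decision tree
# memorizes the training set but generalizes worse to held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with some label noise, so there is something to "memorize"
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep_tree = DecisionTreeClassifier(random_state=0)              # no depth limit
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0)

for name, model in [("unconstrained", deep_tree), ("depth<=3", shallow_tree)]:
    model.fit(X_train, y_train)
    print(name,
          "train:", round(model.score(X_train, y_train), 2),
          "test:", round(model.score(X_test, y_test), 2))
# The unconstrained tree typically scores ~1.0 on the training data
# but noticeably lower on the test set -- that gap is the overfitting.
```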
Underfitting
The opposite of overfitting.
This happens when your model is too simple to capture the patterns in the data. Imagine trying to fit a straight line through a bunch of points that clearly follow a curve. Your model won’t perform well on either training or test data.
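Here's a quick sketch of the straight-line-through-a-curve situation, again with scikit-learn and made-up data:

```python
# Underfitting: a straight line can't capture a quadratic pattern,
# so it scores poorly even on the data it was trained on.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = X.ravel() ** 2 + rng.normal(scale=0.5, size=200)   # clearly a curve

line = LinearRegression().fit(X, y)                                          # too simple
curve = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

print("straight line R^2:", round(line.score(X, y), 2))   # low, even on training data
print("quadratic R^2:   ", round(curve.score(X, y), 2))   # captures the pattern
```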
Bias vs. Variance
What people say:
“There’s a trade-off between bias and variance.”
What it really means:
Bias: Your model is too simple and makes strong assumptions. It misses the real patterns (like underfitting).
Variance: Your model is too complex and reacts too much to small fluctuations in the training data (like overfitting).
Think of it like trying to throw darts:
High bias: Your darts land tightly clustered but far from the bullseye—you're consistently wrong in the same way.
High variance: Your darts are scattered all over the board—on average they're roughly centered, but any single throw is unpredictable.
Good models balance both.
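If you want to see the trade-off in numbers, one rough way (a sketch, not the only definition) is to train the same model many times on fresh noisy samples of a known function, then measure how far its average prediction is from the truth (bias) and how much its predictions wiggle between runs (variance):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x_grid = np.linspace(0, 1, 50).reshape(-1, 1)
true_f = lambda x: np.sin(2 * np.pi * x).ravel()   # the "real" pattern we pretend to know

def predictions(max_depth, n_runs=200, n_points=40):
    """Train the same model on many noisy samples and collect its predictions."""
    preds = []
    for _ in range(n_runs):
        X = rng.uniform(0, 1, (n_points, 1))
        y = true_f(X) + rng.normal(scale=0.3, size=n_points)
        model = DecisionTreeRegressor(max_depth=max_depth).fit(X, y)
        preds.append(model.predict(x_grid))
    return np.array(preds)

for depth in (1, None):   # depth=1 is very simple, None is very flexible
    p = predictions(depth)
    bias_sq = np.mean((p.mean(axis=0) - true_f(x_grid)) ** 2)
    variance = np.mean(p.var(axis=0))
    print(f"max_depth={depth}: bias^2={bias_sq:.3f}, variance={variance:.3f}")
# The stump (depth=1) shows high bias / low variance; the unpruned tree the reverse.
```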
P-Value
What people say:
“This result is statistically significant because the p-value is less than 0.05.”
What it really means:
The p-value tells you how likely it is that you'd see a result at least as extreme as yours if there were actually no real effect (the "null hypothesis"). A low p-value (typically < 0.05) means your data would be surprising under pure chance.
But… it doesn't mean the result is "definitely true," and it isn't the probability that your hypothesis is correct. It just suggests the data are hard to explain by chance alone. Always interpret p-values with caution.
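Here's roughly where a p-value comes from in code, using a two-sample t-test from SciPy on made-up measurements:

```python
# A rough sketch of where a p-value comes from: a two-sample t-test
# comparing made-up measurements from two groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=10.0, scale=2.0, size=50)   # e.g. control group
group_b = rng.normal(loc=10.8, scale=2.0, size=50)   # e.g. treatment group

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"p-value = {p_value:.3f}")
# A small p-value says: data this different would be unlikely if the two
# groups really had the same mean. It does not say the effect is "proven".
```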
Feature Engineering
What people say:
“We improved the model a lot with better feature engineering.”
What it really means:
Feature engineering is the process of selecting, modifying, or creating variables (features) from raw data that help your model make better predictions.
For example:
Converting dates into “day of week”
Turning text into word counts or embeddings
Combining columns (like dividing income by age to get a ratio)
It's less glamorous than deep learning, but often way more impactful.
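Here's what those three examples might look like with pandas (the column names and values are just for illustration):

```python
import pandas as pd

# Toy raw data -- the column names are hypothetical
df = pd.DataFrame({
    "signup_date": pd.to_datetime(["2024-01-05", "2024-01-06", "2024-02-10"]),
    "income": [52000, 64000, 48000],
    "age": [31, 45, 26],
    "comment": ["great app", "useful but slow", "great great app"],
})

# Dates -> day of week
df["signup_dayofweek"] = df["signup_date"].dt.day_name()

# Text -> simple word counts
df["comment_word_count"] = df["comment"].str.split().str.len()

# Combining columns -> a ratio feature
df["income_to_age_ratio"] = df["income"] / df["age"]

print(df[["signup_dayofweek", "comment_word_count", "income_to_age_ratio"]])
```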
Precision vs. Recall
What people say:
“This model has 90% precision and 60% recall.”
What it really means:
These two terms are common in classification problems:
Precision: Of all the things the model predicted as positive, how many were actually correct?
→ Low precision = lots of false alarms.
Recall: Of all the actual positives, how many did the model correctly find?
→ Low recall = it missed a lot.
Example:
In spam detection:
Precision = How many predicted spam emails really are spam?
Recall = How many actual spam emails did we catch?
There’s always a trade-off between them.
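Here's how the two metrics come apart on a tiny, made-up spam example with scikit-learn:

```python
from sklearn.metrics import precision_score, recall_score

# 1 = spam, 0 = not spam (toy labels for illustration)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

print("precision:", precision_score(y_true, y_pred))  # 3 predicted spam, 2 correct -> ~0.67
print("recall:   ", recall_score(y_true, y_pred))     # 4 real spam, 2 caught -> 0.5
```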
Confusion Matrix
What people say:
“Let’s look at the confusion matrix to evaluate the classifier.”
What it really means:
It's a table that shows how well your model is doing by comparing actual vs. predicted labels. It breaks things down into:
True Positives (TP)
True Negatives (TN)
False Positives (FP)
False Negatives (FN)
Why it helps:
It gives you the full picture. Accuracy alone can be misleading, especially if your data is imbalanced (e.g., 95% of patients don’t have a disease).
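Continuing the toy spam example from above, scikit-learn can give you the four counts directly:

```python
from sklearn.metrics import confusion_matrix

# Same toy spam labels as before (1 = spam, 0 = not spam)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
# Accuracy alone (7/10 here) hides the fact that half of the real spam was missed.
```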
A/B Testing
What people say:
“We’re running an A/B test to compare the new version.”
What it really means:
You randomly split users into two groups:
Group A gets the original version (control)
Group B gets the new version (treatment)
Then you compare metrics (like click rates) to see if the change actually helped. A/B testing is widely used in product design, marketing, and UX optimization.
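One common (though not the only) way to analyze a finished A/B test is a significance test on the two groups' click counts; here's a sketch with SciPy and made-up numbers:

```python
# A rough sketch of analyzing an A/B test on click-through counts.
# The numbers below are made up for illustration.
from scipy.stats import chi2_contingency

#             clicked  did not click
control   = [120, 880]   # group A: 1000 users, 12% click rate
treatment = [150, 850]   # group B: 1000 users, 15% click rate

chi2, p_value, dof, expected = chi2_contingency([control, treatment])
print(f"p-value = {p_value:.3f}")
# A small p-value suggests the difference in click rates is unlikely
# to be pure chance -- but sample size and practical impact still matter.
```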
Model Interpretability
What people say:
“This model performs well, but it’s not very interpretable.”
What it really means:
Interpretability is about understanding how and why a model makes its predictions.
Simple models like linear regression are very interpretable—you can see which features matter and how much. Complex models like deep neural networks or random forests? Not so much.
Why it matters:
If you’re building a model for credit scoring, healthcare, or hiring, stakeholders need to trust and understand the decisions.
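To see what "interpretable" looks like in practice, here's a sketch with a linear model on synthetic data; the feature names are hypothetical:

```python
# A minimal sketch of why linear models are easy to interpret:
# each coefficient says how much a one-unit change in that feature
# moves the prediction (feature names here are hypothetical).
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=3, random_state=0)
feature_names = ["income", "age", "account_tenure"]   # illustrative labels

model = LinearRegression().fit(X, y)
for name, coef in zip(feature_names, model.coef_):
    print(f"{name:>15}: {coef:+.2f}")
# A deep neural network offers no such short, human-readable summary;
# you'd need extra tooling (feature importances, SHAP, etc.) to approximate one.
```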
Cross-Validation
What people say:
“We used 5-fold cross-validation to avoid overfitting.”
What it really means:
Cross-validation is a technique to test your model on multiple splits of the data. You divide the data into chunks (folds), train on some, test on the rest, and repeat. It helps you get a better sense of how your model will perform in the real world.
Why it's important:
It reduces the chances of accidentally getting a good score just because you got lucky with your train-test split.
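Here's what 5-fold cross-validation looks like with scikit-learn on synthetic data:

```python
# A minimal 5-fold cross-validation sketch with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)   # 5 different train/test splits
print("fold scores:", scores.round(2))
print("mean +/- std:", scores.mean().round(2), scores.std().round(2))
# The spread across folds tells you how sensitive the score is to the split.
```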
If some of these terms were fuzzy before—good news! You’re not alone. Every data scientist starts with confusion, Googles the same terms over and over, and learns these concepts one step at a time.
The more you work on real projects, read about use cases, and experiment with your own models, the more these ideas will click.
And next time someone says “the precision is too low,” you’ll know exactly what they mean—and why it matters.