Michigan Tech Experts Model the Future of Data Science

Data science drives modern decisions, but its true engine—data science modeling—often remains unseen. At its core, data science is about using information to make better decisions, whether you’re predicting market trends or improving customer experiences. This practice of data science modeling turns raw data into actionable strategies.

Technologies like LLMs and AI accelerate the process of gathering massive amounts of information and extracting meaning, insight, and predictions from data. These tools fuel AI-driven insights and enhance predictive analytics, making decision intelligence more accessible than ever. As a result, data science modeling becomes the bridge between raw information and smart choices.

What Is Data Science Modeling and How Does It Differ From General Data Science?

While general data science covers the entire pipeline—from collecting and cleaning data to visualizing results—data science modeling zeroes in on one critical stage. It focuses specifically on the mathematical and algorithmic frameworks that turn raw data into actionable predictions. Think of it as the engine inside the car: the data may be the fuel, but the model is what drives you forward.

Data science modeling is a subset of data science centered on creating and optimizing those engines. You still work with data, but your goal is to build, train, and refine machine learning models that can recognize patterns and make decisions. As Michigan Tech undergraduate Felicia Huffman describes it, this is the “science behind” LLMs and AI. Without solid modeling, even the cleanest dataset remains just a collection of numbers.

A practical example helps clear things up. When a streaming service suggests a movie you might like, general data science handled the data collection: tracking what you watched, when you paused, and what you skipped. But the actual recommendation? That’s predictive modeling at work. The model uses algorithmic frameworks to learn your preferences and predict what you’ll enjoy next. So while general data science asks “what happened?” and “what does the data show?”, data science modeling asks “what will happen next?” and “how can we make better predictions?”
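To make the recommendation example concrete, here is a minimal sketch of one common approach, user-based collaborative filtering with cosine similarity. It is not the method of any particular streaming service; the users, titles, and the `history`, `cosine`, and `recommend` names are all invented for illustration.

```python
from math import sqrt

# Hypothetical viewing history: 1 = watched to the end, 0 = offered but skipped.
# All users and titles are made up for this sketch.
history = {
    "ana":   {"Deep Sea": 1, "Star Drift": 1, "Quiet Lake": 0},
    "ben":   {"Deep Sea": 1, "Star Drift": 1, "Night Shift": 1},
    "chloe": {"Quiet Lake": 1, "Night Shift": 0, "Deep Sea": 0},
}

def cosine(u, v):
    """Cosine similarity between two users' rating dicts."""
    titles = set(u) | set(v)
    dot = sum(u.get(t, 0) * v.get(t, 0) for t in titles)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(user, history):
    """Rank titles the user has not been offered, weighted by similar users' ratings."""
    seen = history[user]
    scores = {}
    for other, ratings in history.items():
        if other == user:
            continue
        sim = cosine(seen, ratings)
        for title, liked in ratings.items():
            if title not in seen:
                scores[title] = scores.get(title, 0.0) + sim * liked
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ana", history))
```

The collection step produced the `history` table; the modeling step is everything inside `recommend`, which turns past behavior into a ranked guess about what comes next.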

Michigan Tech’s Data Science Experts: Timothy Havens and Sujan Kumar Roy

That shift from looking backward to looking forward is exactly where Michigan Tech’s research strengths come into play. Professors Timothy Havens and Sujan Kumar Roy are the driving forces behind the university’s work in data science modeling, pushing the boundaries of how you can build and refine predictive systems.

Timothy Havens holds the title of William and Gloria Jackson professor of computing. In this role, he focuses on developing algorithms that make data science modeling more efficient and reliable. His expertise touches on everything from machine learning techniques to large-scale data analysis, helping you understand not just what a model predicts, but how it gets there. Havens brings a practical approach to academic data science, ensuring that theoretical advances translate into real-world tools.

Sujan Kumar Roy serves as an assistant teaching professor of computer science in Michigan Tech’s College of Computing. Roy’s work in data science modeling emphasizes applied problem-solving. He focuses on creating models that handle messy, real-world data, the kind you encounter when dealing with sensor readings, user behavior, or streaming information. Roy’s background in computing means he bridges the gap between classroom concepts and industrial applications. Together, Havens and Roy combine decades of research expertise to tackle complex modeling challenges. Their work gives you a clearer picture of how data science modeling evolves from an abstract idea into a trustworthy prediction engine.

Modeling the Future: How Havens and Roy Are Shaping Data Science Trends

Their work on large language models (LLMs) and artificial intelligence pushes the boundaries of what data science modeling can achieve. You might think of LLMs as tools for chatbots or text generation, but Havens and Roy see them as something bigger. They explore advanced techniques that integrate LLMs with traditional models, creating a hybrid approach that handles both structured data and unstructured text. This combination allows you to build models that understand context, nuance, and intent — not just numbers in a spreadsheet.

LLMs and AI speed up both gathering large amounts of information and extracting meaning, insight, and predictions from it. Instead of spending days cleaning and labeling data manually, you can use these tools to surface patterns you might otherwise miss. Havens and Roy’s research anticipates future trends in automated decision-making, where models don’t just predict outcomes but also recommend actions in real time. This is the future of data science: systems that learn, adapt, and act without constant human oversight.

For you, this means practical benefits. Whether you’re building a recommendation engine or forecasting demand, LLM integration can make your models more responsive. Havens and Roy’s work shows that the next wave of AI trend prediction isn’t about replacing human judgment — it’s about augmenting it with faster, more accurate data science modeling.

Real-World Data Science Modeling: Underwater Bioacoustics and Medical Imaging

That theoretical power becomes tangible when you see undergraduates at Michigan Tech applying data science modeling to real-world problems. These aren’t textbook exercises — they’re messy, complex challenges that demand practical, hands-on data science skills.

Take Felicia Huffman, who works with Evan Lucas on an underwater bioacoustics project. She programs machine learning models to identify fish sounds — a task that requires sorting through hours of noisy underwater recordings. For you, this highlights how data science modeling can extract meaning from chaotic data. The models must learn to distinguish fish calls from boat engines, waves, and other aquatic noise. It’s a classic pattern recognition problem, but one with real ecological stakes. Huffman’s work shows that effective modeling often means starting with messy, real-world data and building systems that can handle it.

Then there’s Diana Shadibaeva, an undergraduate research assistant in the University’s Laboratory of Medical Imaging and Informatics. She applies data science modeling to medical image analysis — a field where accuracy can directly impact patient care. Medical images like MRIs and CT scans generate massive datasets, and models must learn to spot subtle anomalies that human eyes might miss. For you, this is a clear example of how hands-on data science training prepares students for high-stakes work. Shadibaeva isn’t just learning theory; she’s building models that could help doctors make faster, more reliable diagnoses.

Both projects share a common thread: each demands technical skill as well as domain knowledge, whether in underwater acoustics or in medical image analysis. The principles from Havens and Roy’s work, such as making models more responsive to real-time data, apply directly here. Whether it’s identifying fish calls or analyzing medical scans, the goal is the same: turn raw data into actionable insights through practical, reliable modeling.

Overcoming Challenges in Data Science Modeling: From Audio to Images

That practical focus on turning raw data into insights is especially critical when the data itself is messy. Modeling noisy information — like fish sounds recorded underwater or complex medical scans — demands techniques that go beyond standard approaches. You need specialized methods to extract reliable patterns from signals that are anything but clean.

Take audio data, for example. Underwater bioacoustics projects face constant hurdles from background noise and variable signal quality. Huffman is gaining hands-on experience working with Evan Lucas on exactly this kind of problem: programming machine learning models to identify fish sounds. The core challenge here is audio signal processing — filtering out engine rumble, wave noise, and other interference so the model can focus on the actual fish calls. Effective noise reduction is a must, and the team relies on practical preprocessing steps to clean the audio before feeding it into the model. This isn’t just an academic exercise; it helps researchers monitor aquatic ecosystems without disturbing them.
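The article does not describe the team’s actual preprocessing, so the following is only a generic sketch of one noise-reduction idea: subtracting a moving average to suppress slow, low-frequency drift such as engine rumble. The sample rate, the 2 Hz "rumble", the 200 Hz "call", and all function names are assumptions chosen to keep the example small; real bioacoustics work would typically use a properly designed band-pass filter.

```python
import math

def moving_average(signal, window):
    """Centered moving average; samples near the edges use a shorter window."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def remove_low_frequency(signal, window):
    """Crude high-pass filter: subtract the slow-moving trend from the signal."""
    trend = moving_average(signal, window)
    return [s - t for s, t in zip(signal, trend)]

def rms(signal):
    """Root-mean-square amplitude, a simple measure of signal energy."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

# Synthetic one-second recording at a hypothetical 1 kHz sample rate.
rate = 1000
rumble = [math.sin(2 * math.pi * 2 * n / rate) for n in range(rate)]          # slow drift
call = [0.3 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]    # faster tone
recording = [r + c for r, c in zip(rumble, call)]

# A 25-sample window spans whole periods of the 200 Hz tone, so the trend
# captures the rumble while leaving the call almost untouched.
cleaned = remove_low_frequency(recording, window=25)

residual = [c - k for c, k in zip(cleaned, call)]
print("rumble left after filtering (RMS):", round(rms(residual), 4))
```

The window length is the main design choice here: too short and the filter also removes the call, too long and the rumble leaks through.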

Medical imaging presents a different but equally demanding set of obstacles. Models must handle high-dimensional data while maintaining clinical accuracy: a single 512 × 512 scan slice already contains more than 260,000 pixels. Shadibaeva’s work in the Laboratory of Medical Imaging and Informatics involves projects that require precise image segmentation. That means teaching a model to outline organs, tumors, or other structures in a scan, pixel by pixel. Any error could affect diagnosis, so the modeling has to be both reliable and interpretable. Techniques like convolutional neural networks help, but you still need careful validation to ensure the model generalizes across different patients and imaging equipment.
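One widely used way to validate a segmentation is the Dice coefficient, which measures how much a predicted mask overlaps a reference mask drawn by an expert. The sketch below is illustrative only: the tiny 4 × 4 masks are invented, and the article does not say which metrics the lab actually uses.

```python
def dice(pred, truth):
    """Dice overlap between two binary masks given as lists of rows of 0/1.

    Returns 1.0 for a perfect match and 0.0 when the masks do not overlap.
    """
    pred_flat = [p for row in pred for p in row]
    truth_flat = [t for row in truth for t in row]
    overlap = sum(p * t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Two empty masks agree completely, so treat that case as a perfect score.
    return 2 * overlap / total if total else 1.0

# Hypothetical reference outline of a structure (1 = inside, 0 = outside).
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Hypothetical model output: three pixels correct, one missed, one extra.
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(dice(pred, truth))  # 0.75
```

To check generalization, you would compute this score separately for held-out patients and for each type of imaging equipment, rather than averaging everything into one number.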

Whether you’re dealing with underwater audio or medical images, the principles of data science modeling remain grounded in the same practical steps: clean your data, choose the right architecture, and test rigorously against real-world conditions. That’s how you turn noisy, complex inputs into trustworthy, actionable insights.

Frequently Asked Questions

How can you get hands-on experience with data science modeling as an undergraduate at Michigan Tech?

You can work directly on real-world projects, such as building models for underwater bioacoustics or analyzing sensor data. Professors like Havens and Roy often involve undergraduates in their research teams, giving you practical, step-by-step experience. This approach helps you learn how to apply data science modeling to solve concrete problems.

What exactly is data science modeling, and how is it different from general data science?

Data science modeling is the specific process of creating mathematical frameworks that learn patterns from data to make predictions or decisions. General data science includes the broader workflow of collecting, cleaning, and exploring data before modeling even begins. In short, modeling is the core analytical engine, while data science covers the entire pipeline from raw data to actionable insight.

Why is data science modeling considered the science behind LLMs and AI?

Large language models and AI systems rely on sophisticated data science modeling to process and generate human-like text. These models are trained on vast datasets using statistical techniques to predict the next word or identify patterns. Without robust data science modeling, LLMs would lack the structure needed to produce coherent and useful responses.
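The idea of predicting the next word can be illustrated with a toy bigram model, which simply counts which word most often follows another. This is a deliberately simplified sketch: real LLMs use neural networks over tokens and far larger datasets, and the corpus and function names below are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up training corpus.
corpus = "the model learns patterns . the model predicts words . the data is noisy ."
counts = train_bigrams(corpus)

print(predict_next(counts, "the"))    # model
print(predict_next(counts, "zebra"))  # None
```

Even at this scale the pattern is the same as in the answer above: the model’s structure (here, a table of counts) is what turns raw text into a prediction.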

