Applied Data Science (ADS)

ADS 500A | PROBABILITY AND STATISTICS FOR DATA SCIENCE

Units: 3 Repeatability: No

This course is an introduction to probability and statistical concepts and their applications in solving real-world problems. This prerequisite course provides a solid background in the application of probability and statistics that will form the basis for advanced data science methods. Statistical concepts, probability theory, random and multivariate variables, data and sampling distributions, descriptive statistics, and hypothesis testing will be covered. Computer-based applications will be used to perform basic statistical analyses. Covered topics include the numerical and graphical description of data, elements of probability, sampling distributions, probability distribution functions, estimation of population parameters, and hypothesis tests. This course will combine the learnings from texts, case studies, and standard organizational processes with practical problem-solving skills to present, structure, and plan the problem as it would be presented in large enterprises and execute the steps in a structured analytics process.

ADS 500B | DATA SCIENCE PROGRAMMING

Units: 3 Repeatability: No

This course is an introduction to fundamental concepts of programming and problem-solving techniques for data science. Python and R are the languages used to analyze and deliver insights from real-world datasets. Topics include the basics of Python and R, data acquisition, integration and transformation, problem understanding, data preparation, standardization, and exploratory data analysis. In addition, command-line tools and editors are explored in UNIX, and methods to access and analyze RDBMS databases are examined. The course concludes by introducing students to the basics of machine learning models.

ADS 501 | FOUNDATIONS OF DATA SCIENCE AND DATA ETHICS

Units: 3 Repeatability: No

Prerequisites: ADS 500A with a minimum grade of C- and ADS 500B with a minimum grade of C-

This course introduces the methods, concepts, and ethical considerations found and practiced in the field of professional data science. Topics include defining and structuring the problem, managing the business, the CRISP-DM and Agile processes, ensuring the science in data science using the scientific method, project management, managing ethical concerns and model bias, and the importance of performing exploratory data analysis. This course will combine the learnings from case studies, texts, and standard organizational processes with practical problem-solving skills to structure, plan, and present the problem as it would be done in large enterprises, including executing steps in the data science work-stream.

ADS 502 | APPLIED DATA MINING

Units: 3 Repeatability: No

Prerequisites: ADS 500A with a minimum grade of C- and ADS 500B with a minimum grade of C-

Data mining is one of the most important topics in the data science field. This course discusses theoretical concepts and practical algorithms for both supervised and unsupervised learning techniques. The course presents data mining principles, methods, and applications with a variety of integrated theoretical and practical examples in classification, association analysis, cluster analysis, and anomaly detection. This course also includes applied examples associated with each topic in data mining using the R and Python programming languages.

ADS 503 | APPLIED PREDICTIVE MODELING

Units: 3 Repeatability: No

Prerequisites: ADS 500A with a minimum grade of C- and ADS 500B with a minimum grade of C-

This course provides a working knowledge of applied predictive modeling. Students will obtain a broad understanding of model training, evaluation, and development procedures with a wide variety of applications to real-world problems. This course introduces best practices for managing data science projects and presenting analytical results to technical and non-technical audiences. Course topics include linear and non-linear regression modeling methods, linear and non-linear classification modeling methods, model selection, variable importance, variable selection, model applications, and code and R package management using RStudio.

ADS 504 | MACHINE LEARNING AND DEEP LEARNING FOR DATA SCIENCE

Units: 3 Repeatability: No

Prerequisites: ADS 500A with a minimum grade of C- and ADS 500B with a minimum grade of C-

This course covers the study of supervised and unsupervised algorithms in the Machine Learning context. Emphasis is placed on formulating, choosing, applying, implementing, and evaluating machine learning models to capture key patterns exhibited in cross-sectional data and longitudinal data. This course also discusses considerations of model complexity, interpretation, and implementation in real-world applications using Python and associated packages. An introduction to Deep Learning is provided in this course.

ADS 505 | APPLIED DATA SCIENCE FOR BUSINESS

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C-

Data science skills are in high demand across a wide variety of industries. This course focuses on real-world use cases of data mining applications, including predicting consumer purchase behavior, brand loyalty, product prices, sales up-lift, basis of purchase, direct marketing campaign cost-effectiveness, rideshare cancellations, competitive online auctions, recommendation engines, and segmenting and identifying important customers. This course covers practical, business-oriented examples and use cases associated with each topic in data mining using Python. Data visualization, effective data storytelling, and analytical communication are also taught. Students practice with Tableau, one of the most popular business analytics and dashboard tools.

ADS 506 | APPLIED TIME SERIES ANALYSIS

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C-

Many datasets naturally have a time series component: records collected over time, financial data, biological data signals such as brain waves or blood glucose levels, weather, and seasonal information. Practicing data scientists need to recognize time series data when they encounter it and know when to apply suitable techniques. This course will cover the major topics in time series analysis and forecasting (prediction), including stationary and non-stationary models, autoregressive and integrated autoregressive models, models for estimation, and spectral analysis using R. Different methods of estimation will be leveraged, including maximum likelihood, Bayesian, and spectral estimation. These approaches will be applied to real-world datasets, culminating in a complete analysis from end to end.

ADS 507 | PRACTICAL DATA ENGINEERING

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C-

In this course, students will learn about the discipline of data engineering. They will learn who data engineers are, what they do, and how their work relates to the field of data science. Topics will include data architecture, relational databases, SQL, data pipelines (ETL and ELT), ethical data engineering (data security and privacy), and data engineering best practices.

ADS 508 | DATA SCIENCE WITH CLOUD COMPUTING

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C-

This course covers the fundamental concepts of cloud computing as it impacts the field of data science. Course topics include cloud economics, distributed storage, the SageMaker ecosystem, distributed processing, model tuning, natural language processing, and model deployment considerations in the cloud. This course will combine the learnings from texts and relevant technical articles with practical hands-on skills to design, implement, and recommend solutions for the business problem as it would be presented in the business world, and execute the steps in a structured model development process.

ADS 509 | APPLIED TEXT MINING

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C-

This course focuses on natural language processing and data mining of text using Python. Topics include collecting and preparing text data, linguistic feature engineering, comparisons of groups of text, building classification models, sentiment analysis, topic modeling, and an introduction to vector-based representations of text.

ADS 550 | NEW STUDENT ORIENTATION

Units: 0 Repeatability: No

This orientation course introduces students to the University of San Diego and provides important information about the MS-ADS program and the technologies that will be used throughout the program. In the orientation, students will learn to successfully navigate through the online learning environment and locate helpful resources. Students will practice completing tasks in the learning environment as preparation for success in their online graduate courses. This orientation course will be available to students as a reference tool throughout the entirety of the program.

ADS 599 | CAPSTONE PROJECT

Units: 3 Repeatability: No

Prerequisites: ADS 501 with a minimum grade of C- and ADS 502 with a minimum grade of C- and ADS 503 with a minimum grade of C- and ADS 504 with a minimum grade of C- and ADS 505 with a minimum grade of C- and ADS 506 with a minimum grade of C- and ADS 507 with a minimum grade of C- and ADS 508 with a minimum grade of C-

The purpose of this Capstone Project is for students to apply the theoretical knowledge acquired during the Applied Data Science Program to a research-based, code-oriented data science project. During the project, students lead the entire end-to-end process, which involves collecting and processing the data while utilizing the appropriate analytical methods. The project will be documented in an academic journal-style article and orally presented, including technical content, in a recorded presentation. Students will work in teams and are encouraged to find project topics that originate from real-world domains in order to tackle unique problem statements that have real-world impact.