California Community Colleges Data Warehouse
Patrick Perry, Vice Chancellor of Technology, Research, and IS
California Community Colleges Chancellor’s Office
Overview
• California’s Educational Data Quagmire
• The CCC System: All About Us
• Data Collection Methodology and System: Getting the Data
• Data Dissemination Systems: Using the Data
Educational Data in California
Typical Silos:
• UC: Unitary Student, Enrollment Summaries
• CSU: Unitary Student, Enrollment Summaries
• CDE (K-12): Student/Institutional Summary, no enrollment
• CCC: Unitary Student, Unitary Enrollment
• CPEC: gets summary extracts from all
The CCC System
• 108 Community Colleges
• 71 Districts, locally governed
• 1.8 million students in Fall, 2.9 million unduplicated students per year
• Largest postsecondary system in the world
• $11 per credit, no entrance requirements
Data Collection
• In 1985, the Legislature said “Let There Be Data”
• Data “good” since 1992-93
• CCCCO (Sacramento) is mandated to collect data from Districts
Uses of Data
• Funding (Mandate--legacy of K-12 based funding scheme)
• Policy Analysis
• Research
• Accountability
• PR, Spin, and Advocacy
What is Collected?
Well-defined and stable Data Element Dictionary
http://www.cccco.edu/divisions/tris/mis/dedmain.htm
Collected in local systems, sent to CO, stored in 3rd normal form
COMIS Data Submission: Timeline (or…why we are the bane of Susan Broyles’ existence…)
Submission calendar (months shown): AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL
• EMPLOYEE ACTUAL (Due Aug 1): Employee Demographic File, Employee Assignment File
• SUMMER TERM END: Student Basic File, Student Enrollment File, Course File, Section/Session/Assignment File, Student Matriculation File, Student Disability File, Student EOPS File, Student Precollegiate Basic Skills File, Student VATEA File
• FALL TERM END: Student Basic File, Student Enrollment File, Course File, Section/Session/Assignment File, Student Matriculation File, Student Disability File, Student EOPS File, Student Precollegiate Basic Skills File, Student VATEA File
• SPRING TERM END: Student Basic File, Student Enrollment File, Course File, Section/Session/Assignment File, Student Matriculation File, Student Disability File, Student EOPS File, Student Precollegiate Basic Skills File, Student VATEA File
• COLLEGE CALENDAR: College Calendar File
• EMPLOYEE CENSUS (Due Nov 1): Employee Demographic File, Employee Assignment File
• ANNUAL (Due Oct 1): Program Award File, Financial Aid File, Assessment File
How is it Collected?
• 71 districts, 71 different MIS/ERP systems
• Colleges must “push” data to us in DED format
• Colleges submit ASCII flat files to us
• Master Database: NCR Teradata
• Weekly update to mirrors and marts (MS-SQL)
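To make the flat-file step concrete, here is a minimal Python sketch of reading one ASCII record into named fields. It assumes fixed-width records. The field names, positions, widths, and the sample record are all hypothetical and are not taken from the Data Element Dictionary.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int  # zero-based offset into the record
    width: int


# Hypothetical layout for illustration only; the real DED layouts differ.
STUDENT_LAYOUT = [
    Field("college_id", 0, 3),
    Field("term_id", 3, 3),
    Field("student_id", 6, 9),
    Field("birth_date", 15, 8),
]


def record_length(layout):
    """Total record length implied by the layout."""
    return max(f.start + f.width for f in layout)


def parse_record(line, layout):
    """Split one fixed-width ASCII record into a dict of stripped field values."""
    line = line.rstrip("\r\n")
    if not line.isascii():
        raise ValueError("record contains non-ASCII characters")
    expected = record_length(layout)
    if len(line) != expected:
        raise ValueError(f"expected {expected} characters, got {len(line)}")
    return {f.name: line[f.start:f.start + f.width].strip() for f in layout}


# Hypothetical sample record: college 123, term 037, one student.
sample = "123" + "037" + "A12345678" + "19800115"
parsed = parse_record(sample, STUDENT_LAYOUT)
```

Keeping the layout as data rather than hard-coded slices means each of the 71 local systems could share one parser and differ only in how they produce the file.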
Data Integrity
Submission process:
1. Syntactical Edit
2. Referential Edit
3. Load Processing Feedback
Data Integrity
4. Detail/Summary/Analysis Reports
http://www.cccco.edu/divisions/tris/mis/submission.htm
Data Integrity
5. Public Humiliation by Reporting, i.e.… no “Leonardization”
6. Fund off of it…that cleans things up real fast
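Steps 1 and 2 of the submission process can be illustrated with a short Python sketch: a syntactical edit checks each field's format on its own, and a referential edit checks that records in one file point at records that exist in another. The field names, patterns, and sample rows are hypothetical, and the actual COMIS edit rules are not reproduced here.

```python
import re

# Hypothetical format rules; real edits are defined by the Data Element Dictionary.
SYNTAX_RULES = {
    "student_id": re.compile(r"[A-Z0-9]{9}"),
    "term_id": re.compile(r"\d{3}"),
    "course_id": re.compile(r"[A-Z]{2,8}\d{1,4}[A-Z]?"),
}


def syntactical_edit(record, fields):
    """Return format errors for the named fields of a single record."""
    errors = []
    for field in fields:
        value = record.get(field, "")
        if not SYNTAX_RULES[field].fullmatch(value):
            errors.append(f"{field}: invalid value {value!r}")
    return errors


def referential_edit(enrollments, students, courses):
    """Return (row_index, message) for enrollments pointing at missing parents."""
    known_students = {s["student_id"] for s in students}
    known_courses = {c["course_id"] for c in courses}
    errors = []
    for i, row in enumerate(enrollments):
        if row["student_id"] not in known_students:
            errors.append((i, f"unknown student_id {row['student_id']!r}"))
        if row["course_id"] not in known_courses:
            errors.append((i, f"unknown course_id {row['course_id']!r}"))
    return errors


# Hypothetical sample submission.
students = [{"student_id": "A12345678"}]
courses = [{"course_id": "MATH101"}]
enrollments = [
    {"student_id": "A12345678", "term_id": "037", "course_id": "MATH101"},
    {"student_id": "B00000000", "term_id": "037", "course_id": "MATH101"},
    {"student_id": "A12345678", "term_id": "37", "course_id": "MATH101"},
]

ENROLLMENT_FIELDS = ["student_id", "term_id", "course_id"]
syntax_errors = [
    (i, err)
    for i, row in enumerate(enrollments)
    for err in syntactical_edit(row, ENROLLMENT_FIELDS)
]
ref_errors = referential_edit(enrollments, students, courses)
```

In this sample, the third enrollment fails the syntactical edit (two-digit term) and the second fails the referential edit (no matching student record). Running the syntactical pass first keeps malformed keys from producing misleading referential errors.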
What Else Can We Throw In The Warehouse?
External Data Matches:
• Transfer: CSU, UC, Student Loan Clearinghouse…annual transfers and cohort tracking
• Wage Data: EDD match for “leaver cohorts”
• Social Services: DSS match to see who’s on assistance
• CDE: SAT-9 scores for HS test takers
We Have The Data…
Now Let’s Do Something With It.
• The Data Mart
• The Cohort Study (SLOTS)
• The Expanded SRTK Files
• The Accountability Program
• The Brio Ad-Hoc Warehouse
www.cccco.edu
Data Mart
• Public site
• Online query tool
• Create ad hoc queries
• Aggregate data
• Download queries in CSV format
• Reports
• Updated as data are submitted or resubmitted
• Download into CSV format
The Cohort Study (SLOTS)
Student Longitudinal Outcomes Tracking System
http://www.cccco.edu/divisions/tris/mis/srtk.htm
• SRTK Rates: Completions & Transfer
• FTF Student Cohort Tracking
• Cohort Demographics
• Awards
• Transfer
The Expanded SRTK Datasets
• Comma-delimited relational dataset; cohort study of FTF students
• Cohort table: demographics
• Enrollment table: enrollments
• Awards table: awards conferred
• Transfers table: transfers
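As a rough illustration of how a comma-delimited relational dataset like this can be used, the Python sketch below loads a cohort table, an awards table, and a transfers table, then computes the share of the cohort with at least one award and with at least one transfer. The column names and sample rows are hypothetical, and the calculation is a simplification rather than the official SRTK rate methodology.

```python
import csv
import io

# Hypothetical extracts; real files would be read from disk with open().
COHORT_CSV = """student_id,gender
S1,F
S2,M
S3,F
S4,M
"""

AWARDS_CSV = """student_id,award_type
S1,AA
S1,CERT
S3,AS
"""

TRANSFERS_CSV = """student_id,segment
S1,CSU
S2,UC
"""


def load_table(text):
    """Parse comma-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def cohort_rate(cohort, outcomes):
    """Fraction of cohort students appearing at least once in an outcome table."""
    cohort_ids = {row["student_id"] for row in cohort}
    if not cohort_ids:
        raise ValueError("cohort is empty")
    achieved = {row["student_id"] for row in outcomes} & cohort_ids
    return len(achieved) / len(cohort_ids)


cohort = load_table(COHORT_CSV)
awards = load_table(AWARDS_CSV)
transfers = load_table(TRANSFERS_CSV)

completion_rate = cohort_rate(cohort, awards)    # students with any award
transfer_rate = cohort_rate(cohort, transfers)   # students with any transfer
```

Using sets of student IDs means a student with two awards counts once, which is the usual expectation for a cohort rate.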
The Accountability Program: Partnership For Excellence
http://www.cccco.edu/divisions/tris/rp/pfe.htm
• Transfers
• Transfer Directed/Prepared/Ready
• Annual Certificates & Degrees
• Successful Course Completion
• Basic Skills Improvement
The Brio Ad-Hoc Warehouse
Internally and Selectively Externally accessible data warehouse containing 3NF, summary, and mart files
Just a SQL Server at an IP address, password protected
Connectivity is by ODBC
How Much?
• Annual Cost of Teradata: Hardware lease: $80k; Maintenance: $50k
• SQL Servers: $5-30k
• Assorted Software (Brio, SAS, ColdFusion)
• Staff: 1 Teradata DBA, 1 SQL DBA, 2 Programmers, 1 IPEDS Coordinator, 1 Submissions Coordinator…and me.
The Future
• CALPASS: Regional Data Sharing Consortia
• Enrollment-Enrollment data collection done regionally, stored centrally
• Used for Program Evaluation & Curriculum Alignment
• Bottom-Up Approach