Data-Mining of Students

Here are my notes on the New York State education data collection, sharing, and mining. At some point I will expand on these points. Last revised Jan 23, 2014. Here is an update from my local district’s recent Board of Education meeting where inBloom was discussed in public for the first time.

First, consider thinking about “data-mining” and other related terms like data-storage, data-collection, etc. in terms of obtaining, storing, and sharing Personally Identifiable Info (Pii). When people hear data they think numbers, and this is way more than just a bunch of number collection mechanisms.

Second, here is the handout I provide at public speaking forums (PDF format).

“The regime of testing has expanded in recent years, in the wake of No Child Left Behind, Race to the Top, and a belief that what goes on in a classroom can most accurately be divined by data.” 
– Rebecca Mead, The New Yorker, linked here

1. Background

– Stimulus Package (ARRA) –> RttT funded by ED Recovery Act under ARRA 2009 –> Accepting money leads to: CCSS adoption, new data projects, teacher/principal evaluation system (APPR),  but also many NCLB waivers.
– NY applies and is awarded the highest tier of money, $700 million, and gets many NCLB waivers (3 years of no “failing schools” additions, etc.) We have to move ahead as a state with all these edu reforms, because we already took the money.
– What is a portal? An entry point to view data in one place.
– Nov 15, 2013 deadline…what dashboard did your school choose? The dashboards are supposed to pull data in from the inBloom cloud. inBloom (formerly known as Shared Learning Collaborative) is the company NYSED contracted to store our state’s educational Pii (NYSED PDF contract here). inBloom hopes to have other vendors and ed-tech companies use the data to develop new products and solutions for district needs. The contract says they can’t sell it, but access is unclear at this point. Also unclear: what other vendors may be involved, whether the data is moving across state lines since it is stored in the cloud, whether this would have been covered under FERPA before the recent changes to the law, etc.
– Have districts and states always collected data? Yes. To this extent? Maybe. Have they always shared some of it with 3rd party vendors? Yes, some of it (like for scheduling and transportation). To this extent? No.
– inBloom is currently paid for by philanthropic services and the data dashboards are paid for by the state (up to $50 million spent by the state as of Nov 2013.)
– Districts are essentially collecting data, sending it up the chain (usually through local BOCES), getting it to NYSED who will supply tech tools for the districts to be able to “see” the data. For now it’s free, but costs are coming … and sooner rather than later.
– Originally 9 states signed up with inBloom, and NY is the only state still all-in.

2. Is all data mining bad? Is the cloud bad?

– Examples: health care, transportation industry, consumer purchases (grocery store, etc.), web site stats, etc.
– Difference between adults making decisions and trying to control their data v. kids not being able to make those decisions.
– You hear about “Big Data” in every industry. It is what every web-based company and every big-box retailer, among others, is trying to figure out, from Facebook, Twitter, and YouTube to Amazon, Toys-R-Us, and Target, and from K12 schools to higher ed. All must learn to use and analyze Big Data.
– We need to use data to make decisions. That is vital to the success of any organization. But, does it need to be so personally identifiable?
– Example: Why did the Obamacare web site not work on day 1? Among other issues, there were multiple database connection issues.
– Let education officials PROVE IT: why is this massive new data collection required, and how will it benefit the learning that takes place in the classroom?
– Your use of the “cloud” includes email, Facebook, Snapfish, etc. The cloud, while not 100% secure, is more secure than local data storage, especially local paper based storage. But, because of the vast volumes of data all stored in one place, it makes it an ideal target for theft, corruption, and more.
– We hear lots about why none of this should be regarded as anything out of the ordinary. However, we hear a lot less on why all of this is needed and how it benefits those students in the classroom.
– Are districts clamoring for the likes of inBloom services? Is it a game-changer? Why exactly can’t a district manage their data locally or through a BOCES developed database system?

3. What are they (the state) looking to collect?

– The state has always asked for data and the districts have always sent some. Districts are also already using cloud systems (email, transportation, scheduling, etc.)
– Estimated 350+ data points now in the NYSED Data Dictionary PDF (here) (here) (here) including at least 67 new points (see the Y* entries in the first column of the PDF.)
– The state has never before been able to connect the dots (data points) to other systems, BOCES, schools, etc.
– What is new is the potential for 400 points of data because of how the data store is designed. Take a look at the entire list of inBloom data enumerations (PDF here). Is this all of what NYS is collecting? That is unclear and we need the state to specifically list out the elements they collect (mandatory) and those fields they encourage districts to submit (recommended). I assume the Data Dictionary above covers it for this current school year.
– From inBloom FAQ: “Common Education Data Standards (CEDS) lists more than 400 possible data fields that states and districts can choose to collect when using a service like inBloom.”
– NYSED wants: disciplinary records, parent contact info, single-parent household, every course taken with grades, etc.
– NYSED wants districts to make full use of the new inBloom data store and provide more data than is actually needed. That’s where the 400 fields come into play. The potential is there, but why? Illinois has nearly 35 districts participating in the inBloom project, but their state has a district opt-out AND will not permit districts to send in health related data.
– What are districts looking (required) to collect? Good luck trying to find out.
– Updated Jan 24: 34 state ed chiefs (including King in NY) have sent a letter to USDoE stating that they will not share Pii student data with the federal gov’t (here)

4. What is the bigger picture?

– Collecting educational data is part of a larger NY state P-20 Longitudinal Database Project see here
– This project (funded by RttT funds) is an intra-connected state system of databases all tied to one, new, unique child identifier that follows them for life. The state outlined that for us in their response to USDOE questioning (see link just above.)
– Goal to connect the educational data to DMV, Children’s Services, Dept of Labor, Dept of Tax & Finance, and more.
– The NY higher ed system has made some recent adjustments as well: SUNY Common App and SOAR (Why are they making parents enter all details about all HS courses?) “SUNY is strongly recommending that students self-report their high school grades and courses, as well as their ACT and SAT test scores using the SUNY Online Academic Record (SOAR).” Colleges always asked for a HS transcript, generally upon acceptance to the college, not the start of the application process. What happens to all that data if the student then decides to not attend a SUNY school? It’s all been uploaded … who takes it down?
– Cost: Who pays for this after RttT funds run out? Commissioner King clearly stated in the Nov 20th NYS Assembly Education Hearing on Student Data Protection/Privacy that districts will pay anywhere from $1-3 per student for the data dashboard (portals) and the inBloom contract (memo? FAQ?) with NY indicates it will cost districts $2-5 per student record starting in year 3 for inBloom (2015). Commissioner King also stated in the Assembly Hearing that as of that meeting date the State has spent roughly $50 million on just the data dashboard (portal) initiative.
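To put those per-student figures in perspective, here is a rough back-of-the-envelope sketch in Python. The per-student ranges are the ones quoted above ($1-3 for the dashboard, $2-5 for inBloom). Everything else is my assumption: the enrollment of 5,000 is a made-up district, the function name is mine, and I am assuming both fees are charged once per year, which the sources do not spell out.

```python
# Rough yearly cost range for a district, using the per-student figures quoted above.
# ASSUMPTION: both fees are annual and simply add together.
# ASSUMPTION: the enrollment below is hypothetical; plug in your own district's count.

DASHBOARD_PER_STUDENT = (1, 3)  # dollars (low, high) for the data dashboard
INBLOOM_PER_STUDENT = (2, 5)    # dollars (low, high) for inBloom, starting in year 3


def annual_cost_range(enrollment):
    """Return (low, high) estimated yearly cost in dollars for both services combined."""
    low = enrollment * (DASHBOARD_PER_STUDENT[0] + INBLOOM_PER_STUDENT[0])
    high = enrollment * (DASHBOARD_PER_STUDENT[1] + INBLOOM_PER_STUDENT[1])
    return low, high


hypothetical_enrollment = 5000
low, high = annual_cost_range(hypothetical_enrollment)
print(f"${low:,} to ${high:,} per year")
```

Under those assumptions, the combined fee is $3 to $8 per student, so the total scales directly with enrollment. Swap in your own district's number to get a figure to bring to a board meeting.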

5. What about the breaches?

– My experience at a local LI district and the current breach in Sachem (not related to inBloom). Those were inside breaches of small volumes of data. lists dozens of data breaches in EDU (45) during the 2013 calendar year and hundreds overall, most higher ed. Starting to see many K12 breaches listed on that site (12 K-12 related in 2013.) There were 86 reported EDU breaches in 2012.
– inBloom TOS states: “inBloom, Inc cannot guarantee the security of the information stored in inBloom or that the information will not be intercepted when it is being transmitted.” See the section (here) under E. Breach Remediation. Is the state responsible or the individual district? Not clear. The state seems to think this is ok. “Trust us.”
– A breach will happen (think Target, TJ Maxx, etc.). You cannot guarantee that it will not. How will they respond? Who is held liable? How will they contain it?
– At the Nov 20th NYS Assembly hearing referenced above, Commissioner King was asked about breaches. What are penalties, who is responsible, etc. He replied that he wasn’t sure and he’d look into it. Great.

6. What about FERPA & HIPAA?

– Law set up to protect student and parent privacy.
– Revised 2009 and 2011. Strengthened? No. In the 2011 revision, at least 11 different “circumstances” under which student information could be shared without parental knowledge or consent were revised or added.
– Not revised through votes. Executive order.
– FERPA appended in 2011: § 99.3 to define the term ‘‘education program’’ as “any program principally engaged in the provision of education.” That is the open door for education companies to gain access to student data.
– Many lawyers still debating the impact of the recent changes.
– After revisions, FERPA does not require that PII be stored within the US (think cloud based system with servers housed off shore). Read more here. NY’s contract with inBloom specifically states data must be stored within USA. How is this verified and who is consistently checking for compliance?
– The 2011 revisions to FERPA include: “The amendments add §99.31(a)(6)(ii) to clarify that FERPA does not preclude FERPA-permitted entities with access to PII from entering into written agreements with organizations conducting studies and redisclosing PII on behalf of the educational agencies and institutions (e.g., school districts and postsecondary institutions) that disclosed the information to the FERPA-permitted entities. In the Preamble to the final regulations, ED clarified that the redisclosure of PII under §99.31(a)(6)(ii) does not require the consent of the educational agencies and institutions that disclosed the PII to the FERPA-permitted entities.”  More on this here
– Districts have to make an annual FERPA notice disclosure and let parents opt out of having student directory info shared.
– The understanding I have is that schools are not obligated to follow HIPAA restrictions because the health records are really “educational” records, so FERPA applies. I am not completely clear on this and have been trying to figure out the details as found here.

7. Now what?

– New lawsuit from NYC (hearing date in late November, moved to Jan 2014) claims that storing the data violates the 1984 Personal Privacy Protection Law. Read more here and here.
– Sample questions to ask your district are below. ASK Qs, ASK Qs, ASK Qs, then question the answers, research, and ask more.
– NYS possible legislation on this issue (see big list of all education related legislation I could find here)
* Senator Flanagan’s Summary “Assessing the NYS Regents Reform Agenda” (here): Delay operation of the education data portal for one year and a new student privacy bill. Does it go far enough? No. Doesn’t question the volume of data being collected.
* NY Senate: Senator Martins S.5930, Senator Robach S.5932, Senator Flanagan S.6007
* NY Assembly: Assemblywoman Nolan A7872A, Assemblyman O’Donnell A6059A
– Work with privacy advocacy groups to identify problems beforehand. My duh moment. As far as I can tell, this did not happen at the NYSED level.
– When inBloom launched in the spring of 2013, there were 9 states involved. Now, Jan 2014, no other state is left all-in. So why bother? Transfer student records? Can’t be possible if fewer and fewer participate.
– So here’s the kicker with the state tests … where’s the data? No one outside of NYSED sees more than just the score. Why?
– We have to start taking a serious look at data collection and figure out what is needed and what isn’t. Is data being used to drive instruction? How? Has it worked or not?  PROVE IT!
– If there are graduation problems in certain areas or districts, why not collect data in just those areas? Why collect so much, so quickly on ALL public school NY children?
– Where was this massive data collection, storage, mining piloted? What were the results?
– Where is the state’s public hearing on the P-20 Project? Where is the open discussion of this?
– Where is the state oversight on this issue moving forward? Where is district oversight? Where is the annual security report? The implementation report? The ROI report? Who monitors inBloom and verifies they are in compliance with FERPA and other state laws?
– Look at the recently passed legislation in Oklahoma as an example of something that can be adjusted (here) and (here)

* Teachers and principals are spending an inordinate amount of time gathering data to be input into systems for others outside their school system, to evaluate and use as basis for decisions. In-seat attendance time now being tracked. Why?

* Parents have no idea the volume of data being collected and analyzed. Until I get answers from either local or state officials as to what specifically they are collecting, I will fight the collection of such sensitive data points. There hasn’t been one education official who has been completely open and honest about this and that tells me something. It tells me they are hiding information from parents and the public for a reason.

– (here)
– Potential inBloom data fields (here)
– Actual USDOE web page listing the RTtT details summarized in my post (here)
– Why We Refused post (here)
– Hudson Valley inBloom preso here and Q&A here (fantastic read, both of them)
– NYSED Data Dictionary 2013-14/2014-15 (here) (here) (here)


Here are questions to ask your local school districts about data collection, storage, and mining:
1. Which data dashboard system has the District decided to use?
2. What was the rationale for choosing the particular system? What were the reasons for not choosing one of the other systems?
3. Is there a list of questions and answers the District presented to each vendor before selection? Can the school community see such information?
4. How will our students’ privacy be protected?
5. What data is required vs. what’s optional?
6. How long will this product be free for districts? How much will the cost be after that time period?
7. What are the long term plans for sharing this student data? How long does the state plan to hold on to it? With whom will they allow it to be shared? How do parents see what was shared, or will be shared? How will corrections be made?
8. Can District parents and students “opt out” of the collection and storage of personal information in education databases associated with the state’s implementation and/or requirements of the federal RttT program? If so, what is the process? If not, why not?
9. What exact data has been shared with the NY State Education Department or officials last year and this year and what are the plans for next year?
10. Has the District put any limitations on the type of data that you will collect in the future?
11. What safeguards has the District placed on both the electronic student data it stores as well as any and all paper based documents?
12. Is the District aware of any NY State data collection and transmission requirements that were implemented for this current, or the next, school year? Are there any new pieces of data (referred to as data points) that the District plans, or is required, to send to State database(s)? Can you elaborate in any detail on the specifics of this new state data initiative?
13. Has the district contacted any NYSED official to discuss data collection, storage, mining? If so, what was asked and what were the responses?
14. What is your understanding of how the inBloom system will work? How will this benefit our district?
15. Who is liable for a data breach of district information stored with inBloom, or used by a third party vendor through inBloom?
16. With whom does the state share [insert district name here] data? With what state agencies? With what third party vendors now and in the future? How will that be tracked and how will parents be notified? What is the educational purpose of sharing the data with other state agencies, if that is taking place, and with various third party vendors? Who approves those contracts? Do the [insert district name here] attorneys get to review those contracts?
17. How long does the state plan to hold data on [insert district name here] students? Until graduation? Through college? For life? What happens to the data of a student who leaves the district? Of a family who leaves the district?
18. Did [insert district name here] offer up to NYSED the ability to voluntarily share student data with vendors necessary for state use of the data, but not necessarily district use?
19. What specific data does [insert district name here] send to NYSED and how is it transmitted? Literally spell out the fields for everyone to see (much like is now law in Oklahoma.) That should be easy to do since a student, teacher, principal record can easily be extracted to see fields and personally identifiable information (Pii) stripped from the records. This would be any and all data sent under various portals, systems, spreadsheets, and reporting mechanisms.
20. What new data fields is [insert district name here] required to send this year and next? What new data elements is the state collecting specifically as a result of using inBloom or because of the technological capability that inBloom provides? What about in the future?
21. If [insert district name here] did opt out of RttT, and one of the requirements of RttT was to send parent contact info, is that data still sent to NYSED and if so, for what purpose?
22. There is a provision in the state’s contract with inBloom that clearly states that it is up to the district to participate in the inBloom initiative AND the district can withdraw from inBloom and request that data be removed. That’s the loophole used by Southold, Comsewogue, and others. What is [insert district name here] attorney’s view of that loophole or wording in the state’s contract with inBloom?
23. inBloom Inc. states that it “cannot guarantee the security of the information stored, or that the information will not be intercepted when it is being transmitted.” Please detail any communication between representatives from inBloom, the New York State Education Department, and [insert district name here] BOE members or administration regarding this statement.
24. Please cite the specific wording and definitions in both federal and state law that requires data collection. Please also cite the specific pieces of data these laws are requiring [insert district name here] to collect and send to the state.
25. What happens if a student transfers from one district to another within the state? What data is sent electronically to the new district without parent consent? What about medical records associated with special needs students? What happens if the student transfers out of state, to a non-inBloom state?
26. Why does all the state data collected about students need to be personally identifiable to each individual student? Please provide evidence and educational rationales that support that decision.
27. Please explain how the new inBloom database/system fits into New York’s larger plans for a statewide P-20 Longitudinal Database System. What is the educational value of the statewide longitudinal databases? Where has one been setup, tested, and used for an extended length of time (a decade or more) and found to be a valuable tool in making statewide decisions? What is the direct educational benefit of a P-20 system to the [insert district name here] community?
28. Is the state-held [insert district name here] student data stored within systems that are maintained within the United States, or is the data stored on servers outside of America? What specific laws govern the storage, transmission, use, and maintenance of the data stored overseas, if it is? If a data breach involving [insert district name here] data happens on a system housed overseas, what legal implications does that present to the district?
29. When the inBloom initiative began last spring, there were 9 states committed to some part of the process. Not even a year later and there are only 2 standing. So this begs the simple question: What do these 7 other states know that we don’t? Why have they withdrawn?
30. Who verifies that the state remains in compliance with all data privacy protection laws and the inBloom contract/agreement?
31. How is the inBloom and data dashboard project funded in the short term and what are the plans for long term costs and funding? What will this cost [insert district name here]? It appears that even if [insert district name here] withdraws from RttT and chooses not to use the dashboard, [insert district name here] still has to pay the state for the data storage at up to possibly $3-5 per student. In essence, [insert district name here] is paying the state to see the data we sent them.

Lastly: 32. What specific data has been shared about your child, with which third parties, and over what period? (You have this right according to the state’s Personal Privacy Protection Law, and the state is supposed to respond within five working days.)
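Question 19 above says it should be easy to extract one record, strip the Pii, and show everyone the fields. Here is a minimal Python sketch of what I mean. The field names and the sample record are invented for illustration; this is not NYSED's or inBloom's actual format, and the function name is my own.

```python
def field_listing(record):
    """Return the record's field names (sorted) with every value replaced by a placeholder."""
    return {name: "[REDACTED]" for name in sorted(record)}


# HYPOTHETICAL record: these fields are made up, not taken from any real data dictionary.
sample_record = {
    "student_name": "Jane Doe",
    "date_of_birth": "2005-03-14",
    "parent_phone": "555-0100",
    "course_grade": "B+",
}

for name, value in field_listing(sample_record).items():
    print(f"{name}: {value}")
```

The point is that the public gets to see which fields exist without seeing any child's actual values. A district could publish a listing like this for every system it reports through.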
