This is an excellent overview of data issues. My only concern is that it places data at the heart, when data (or more generally, observations) are subordinate to the explanations that drive the observing. I suspect you agree at some level, given your intent to discuss synthetic data and their underlying generative bases. On closer examination, every aspect depends on explanatory foundations: consider bias, priors, background knowledge, and search strategies.
My concern is that the underlying theories and explanations may be more elusive, particularly if people think it obvious that access to data is the key to success, as opposed to one factor among many (as you presented it).
Thanks for your article.