Project kick-off

2025-03-10

Rick Gilmore

Donald Rumsfeld’s unknown-unknowns and the Johari window, Wikipedia contributors (2025)

Agenda

  • A brief history
  • Overview of Aims
  • Next steps

A brief history

Databrary.org is a digital, non-commercial, academic data library housed at New York University (NYU). It is the world’s only repository specialized for storing, streaming, and sharing research video and linked identifiable data at scale.

  • NSF workshop (2011)
  • NSF (2012-2018); NICHD proposal (2013-201)
  • DB1 go-live (March 2014)
  • Alfred P. Sloan Foundation (2017), James S. McDonnell Foundation (2018) grants

DB1 growth through Feb 2025

Volume sharing status through Feb 2025

Citations through Feb 2025

Values

  • As open as possible
  • As closed as necessary

Vision

  • Video is an unparalleled source of data, documentation, and demonstration
  • Make Databrary the leader in open behavioral science
  • A tool for discovery

Overview of Aims

Aim 1: Enhanced data discovery

  • Why: Accelerate data reuse
  • How:
    • Enhanced search
    • Annotation layers: Search & visualize

Aim 2: Custom collections

  • Why:
    • Make data reuse easier
    • PLAY Project release
  • How:
    • Select, clone/copy shared files, sessions, metadata, volumes
    • Maintain virtual links
    • AKA “virtual volumes”
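
One way to picture a "virtual volume" is as a collection that holds links to shared files rather than copies of them. The sketch below is a minimal illustration under that assumption, not the DB2 design: the `SharedFile` and `VirtualVolume` classes, their fields, and the sample IDs and file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharedFile:
    """A file that lives in one source volume (hypothetical model)."""
    volume_id: int
    session_id: int
    name: str


@dataclass
class VirtualVolume:
    """A custom collection that stores links to shared files, not copies."""
    title: str
    links: list[SharedFile] = field(default_factory=list)

    def add(self, shared_file: SharedFile) -> None:
        # Keep a reference to the original record; skip duplicates.
        if shared_file not in self.links:
            self.links.append(shared_file)

    def source_volumes(self) -> set[int]:
        # The source volumes this collection draws from.
        return {f.volume_id for f in self.links}


# Illustrative records; the IDs and names are made up.
a = SharedFile(volume_id=1, session_id=10, name="example_01.mp4")
b = SharedFile(volume_id=2, session_id=20, name="example_02.mp4")

collection = VirtualVolume(title="Example custom collection")
collection.add(a)
collection.add(b)
collection.add(a)  # already linked, so ignored

print(len(collection.links), sorted(collection.source_volumes()))
```

Because the collection keeps references, each linked file can still be traced to its source volume and session, which is the property "maintain virtual links" points at.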

Aim 3: Workspaces

  • Why:
    • Active curation (Soska et al. 2021) \(>>\) post hoc
    • In progress \(\neq\) published/shared
  • How:
    • Workspace with essential DB2 metadata
    • Flexible views
    • Semi-automated curation for sharing

Aim 4: Scriptable access

  • Why: Transparency & reproducibility
  • How:
    • DB2 API
    • Secure authentication
    • Update databraryr
    • Develop & publish databrarypy

Aim 5: Curation

  • Why: Self-curation idiosyncratic
  • How:
    • Review fully shared, overview only, private
    • Curate

Timeline

Other opportunities

  • Overhaul, polish UI/UX
  • Institutional subscriptions (NYU TOV TAC 2025)
    • Institution dashboard, analytics
  • databrary.ai/videobrary.ai
  • {zoo,clinic,teach}brary
  • Web-based video annotation

Discussion

  • What do we know?
  • What don’t we know?
  • What’s important?
  • What’s urgent?

Known-knowns

(columns: Urgent | Not urgent)
Important: Upload; search; Montrose custom collections; databrary{r,py}; curation
Not important: UI cleanup

Also important/not urgent

Resources

Code

These slides were written in R Markdown and rendered with Quarto as HTML slides built on the reveal.js framework.

References

Soska, Kasey C, Melody Xu, Sandy L Gonzalez, Orit Herzberg, Catherine S Tamis-LeMonda, Rick O Gilmore, and Karen E Adolph. 2021. “(Hyper)active Data Curation: A Video Case Study from Behavioral Science.” Journal of eScience Librarianship 10 (August). https://doi.org/10.7191/jeslib.2021.1208.
Wikipedia contributors. 2025. “There Are Unknown Unknowns.” Wikimedia Foundation, Inc. February 11, 2025. https://en.wikipedia.org/wiki/There_are_unknown_unknowns.