Project Charter & Roadmap
Mission Statement
Open Embeddings aims to create a sustainable, standardized way to lower the barrier to entry for new AI agents, scripts, and services while reducing the pressure on content providers from repeated re-embedding operations across multiple models.
Vision
We envision an internet where content discovery is AI-native by default, where:
- Content providers can expose their material efficiently without bandwidth waste
- Developers can build semantic search applications without re-embedding costs
- End users can discover relevant content across platforms and models seamlessly
- The open internet remains accessible and avoids capture behind walled gardens
Project Goals
We hope you come away understanding the thesis, and that some of you will go further and help us:
- Define an open format and specification that lets content providers support multiple models
- Update commonly used content publishing tools to support the format securely
- Generate a distributed corpus of cross-space material so that data can move between closed-model and open-model embeddings
Organizational Structure
Non-Profit Foundation
- Run as a non-profit organization funded through donations and open-source grants
- Best-in-class administrative fee structure for any paid work
- Community volunteer-driven development for core functionality
- Seek partnerships with organizations sharing our vision of an open, accessible internet
Target Partners
- Mozilla and Google - Initial outreach for website and RFC development support
- Tech companies committed to open standards
- Educational institutions researching AI and semantic web technologies
- Non-profits focused on open-source software and AI ethics
Development Roadmap
Phase 1: Foundation (Q1 2024)
Status: In Progress
- Define core problem statement and solution approach
- Create project documentation and website
- Draft initial RFC specification
- Build reference parsers in Python and JavaScript
- Establish community feedback channels
- Create validation tools and testing frameworks
Success Metrics:
- RFC draft complete and published
- Reference implementations available
- Community engagement initiated
Phase 2: Adoption (Q2-Q3 2024)
Status: Planning
- Integrate with popular CMS platforms (WordPress, Drupal, Jekyll)
- Partner with content creators for pilot implementations
- Develop browser extensions and developer tools
- Create interactive demos and educational content
- Submit to standards bodies for formal review
Success Metrics:
- 10+ websites implementing Open Embeddings
- CMS plugin ecosystem established
- Standards body engagement initiated
Phase 3: Scale (Q4 2024)
Status: Future
- Address hard implementation problems (model sprawl, cache invalidation)
- Research cross-model embedding transformation frameworks
- Develop enterprise-grade security and performance features
- Create distributed embedding network protocols
- Establish certification and compliance programs
Success Metrics:
- 100+ websites using Open Embeddings
- Academic research partnerships established
- Industry adoption by major platforms
Technical Priorities
Hard Implementation Problems
- Model Sprawl
  - Research recent academic work on embedding space transformations
  - Develop framework for multi-modal model compatibility
  - Create standardized APIs for model conversion
- Cache Invalidation
  - Design trust mechanisms for embedding freshness
  - Implement content change detection systems
  - Develop distributed validation networks
- Performance Optimization
  - Compression algorithms for embedding vectors
  - Pagination strategies for large content sets
  - CDN integration and caching strategies
- Security & Privacy
  - Prevent sensitive information leakage in embeddings
  - Access control mechanisms for private content
  - Audit trails and compliance monitoring
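To make two of the items above concrete, here is a minimal sketch of content change detection and embedding compression. It is illustrative only: the charter does not define a record format, so `PublishedEmbedding`, its fields, the function names, the whitespace normalization, and the choice of SHA-256 and symmetric int8 quantization are all assumptions, not part of any Open Embeddings specification.

```python
from __future__ import annotations

import hashlib
import struct
from dataclasses import dataclass


def content_fingerprint(text: str) -> str:
    """SHA-256 hex digest of the content after collapsing whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class PublishedEmbedding:
    # Hypothetical record shape, assumed for this sketch only.
    model: str           # name of the model that produced the vector
    content_sha256: str  # fingerprint of the content that was embedded
    scale: float         # multiplier that maps int8 values back to floats
    packed: bytes        # one signed byte per dimension


def quantize(vector: list[float]) -> tuple[float, bytes]:
    """Symmetric int8 quantization: one byte per dimension."""
    peak = max((abs(v) for v in vector), default=0.0)
    scale = peak / 127 if peak else 1.0
    ints = [round(v / scale) for v in vector]
    return scale, struct.pack(f"{len(ints)}b", *ints)


def dequantize(scale: float, packed: bytes) -> list[float]:
    """Recover approximate float values from the packed bytes."""
    ints = struct.unpack(f"{len(packed)}b", packed)
    return [i * scale for i in ints]


def publish(model: str, text: str, vector: list[float]) -> PublishedEmbedding:
    """Bundle a compressed vector with the fingerprint of its source text."""
    scale, packed = quantize(vector)
    return PublishedEmbedding(model, content_fingerprint(text), scale, packed)


def is_fresh(record: PublishedEmbedding, current_text: str) -> bool:
    """True if the content still matches what was embedded."""
    return record.content_sha256 == content_fingerprint(current_text)
```

A consumer could call `is_fresh(record, page_text)` before trusting a published vector and re-embed only when it returns `False`. The quantization step trades a small, bounded loss of precision (at most one `scale` step per dimension) for a payload of one byte per dimension.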
Community Engagement
Call to Action Hooks
- Participate in the draft spec - Review and contribute to RFC development
- Build POCs - Create proof-of-concept implementations
- Share your use cases - Help us understand real-world requirements
- Contribute to the project - Join development, documentation, or outreach efforts
Competitive Landscape
We currently know of no other groups pushing in this direction. Our strategy:
- Encourage collaboration - Invite similar initiatives to join forces
- Open development - Transparent, community-driven specification process
- Broad participation - Welcome input from all stakeholders
Sustainability Model
Funding Sources
- Individual donations from community members
- Open-source grants from foundations and tech companies
- Partnership agreements with aligned organizations
- Revenue from optional certification and compliance services
Cost Management
- Volunteer-driven core development
- Minimal infrastructure costs (static site hosting, basic tooling)
- Community-contributed documentation and examples
- Grant funding for major development initiatives
Success Indicators
Short-term (6 months)
- RFC specification stabilized and published
- Reference implementations available and tested
- Active community of 50+ contributors and users
- Partnership agreements with 2+ major organizations
Medium-term (1 year)
- 100+ websites implementing Open Embeddings
- Integration with major CMS platforms
- Academic research partnerships established
- Standards body submission completed
Long-term (2+ years)
- Widespread adoption across the web
- Reduction in content re-embedding costs industry-wide
- Thriving ecosystem of tools and services
- Measurable impact on open internet accessibility
Get Involved
Ready to help shape the future of AI-native content discovery?
- Review our technical specification
- Try the examples and code samples
- Share your feedback and use cases
- Join our development community
Together, we can ensure the spice continues to flow freely across the open internet.