BigCode is a community project jointly led by Hugging Face and ServiceNow. Both organizations committed research, engineering, ethics, governance, and legal resources to ensure that the collaboration runs smoothly and makes progress towards the stated goals. ServiceNow Research and Hugging Face have made their respective compute clusters available for large-scale training of the BigCode models, and Hugging Face hosts the community's datasets, models, and related applications so that they are easy for everyone to access and use.
The BigCode project is governed by a steering committee jointly led by ServiceNow and Hugging Face. The committee is responsible for organizing and managing the project (including research strategy and publication goals) and provides oversight across all working groups.
Decisions that cannot be resolved at the community level are escalated to the lead of the relevant working group for facilitated discussion, with further input and tie-breaking decisions by the steering committee as a last resort.
Governance for the project is open: anyone from the community is encouraged to join any working group or task force of interest, and to engage in and contribute to that group's work and decision making.
Please see the Governance Card for more details.
BigCode is a research collaboration and is open to participants who:
- have a professional research background and
- are able to commit time to the project.
In general, participants are affiliated with a research organization (either in academia or industry) and work on the technical/ethical/legal aspects of LLMs for coding applications.
Community-invited guest subject matter experts are also encouraged to participate in relevant discussions where they are able to make an active contribution to the goals of the project.
We run the BigCode project through the following tools and platforms:
- We actively manage the project through a GitHub project board
- We use Slack for all internal communication (apply here to join!)
- We train models with a clone of Megatron-LM
- We host all code repositories on GitHub
- We host all model weights and datasets on Hugging Face
We are thankful for the support and contributions of the broader AI ecosystem. In particular, we would like to thank Toloka for supporting BigCode with the use of its crowd platform and professional services for the work of our PII task force.