
Constellation Network and Common Crawl Provide Secure Validation of AI Training Data

SAN FRANCISCO, Dec. 19, 2024 (GLOBE NEWSWIRE) — Constellation Network, a Web3 ecosystem validated by the US Department of Defense, today announced the launch of a customized blockchain developed in partnership with the Common Crawl Foundation, to create the industry’s first cryptographically secure, immutable archive of internet data for AI training and development.

The collaboration introduces a new approach to validating and securely accessing 17 years of internet crawl data through an immutable, cryptographically secured blockchain network built on Constellation. The archive spans nearly 9 petabytes, and 80% of Large Language Models (LLMs) use it as training data. This application-specific network, or Metagraph, addresses pressing concerns in AI development (data provenance, privacy, and ethical sourcing) while exploring new use cases for blockchain technology in emerging industries. The network will also use Constellation’s DAG utility asset to secure the archived internet crawls. This is a significant step in using cryptocurrency as a mechanism for businesses to notarize data: the cost becomes an operational expense for the business rather than the consumer costs or gas fees typical of many other layer-one networks.

Key Technological Innovations

  • Comprehensive Data Archiving: A fully immutable copy of internet history, providing unprecedented transparency and traceability for AI training datasets
  • End-to-End Encryption: Cryptographic security that ensures data integrity throughout the AI development lifecycle
  • Ethical AI Framework: A robust solution for addressing concerns around data collection, storage, and usage in large language models
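
For developers, the data-integrity idea behind these points can be illustrated with an ordinary digest check. The following is a minimal sketch only, under stated assumptions: it assumes a SHA-256 digest of an archive file has been recorded somewhere trustworthy, and it does not use or describe Constellation's or Common Crawl's actual APIs or record formats. The function names `sha256_of_stream` and `matches_notarized_digest` are hypothetical.

```python
import hashlib
import hmac
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    # Read fixed-size chunks so large archive files never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_notarized_digest(stream, notarized_hex):
    """Check a freshly computed digest against a previously recorded one.

    `notarized_hex` is assumed to be a hex SHA-256 string obtained from a
    trusted record; how that record is fetched is outside this sketch.
    """
    return hmac.compare_digest(sha256_of_stream(stream), notarized_hex.lower())


if __name__ == "__main__":
    # Stand-in bytes for a crawl segment; a real file would be opened
    # with open(path, "rb") instead of io.BytesIO.
    segment = b"example crawl segment bytes"
    recorded = hashlib.sha256(segment).hexdigest()  # hypothetical recorded value

    print(matches_notarized_digest(io.BytesIO(segment), recorded))
    print(matches_notarized_digest(io.BytesIO(segment + b"!"), recorded))
```

If even one byte of the file differs from what was recorded, the digests no longer match, which is the property that makes a notarized digest useful for provenance checks.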

“This integration is a critical step forward in securing the future of AI development,” said Alex Brandes, CTO of Constellation Network. “By ensuring cryptographic integrity and immutability of training data, we are addressing one of the most pressing challenges in the field today: trustworthiness and provenance of datasets. We believe our platform will grow to become a cornerstone in the field of responsible AI development, setting new standards for data integrity and trust.” 

Industry Applications

The blockchain-enabled data archive is already attracting attention from advanced AI research initiatives. TraceAI, a project developed through the National Science Foundation (NSF) and SBIR program, is in the testing stages of developing its own application-specific network, built on Constellation, to add immutability, auditability, and proof of authorship to its training models and to develop advanced watermarking technologies. TraceAI will also leverage Common Crawl’s Constellation-built solution to extend its work in blockchain-encrypted AI to include tracking the origin of data.

Kevin Jackson, Vice President of Space Domain Communications & Commercialization for Forward Edge-AI, emphasizes the significance of this breakthrough: “This represents the natural evolution of AI and machine learning model development, transforming data management from a technical challenge to a trusted business tool that drives global standardization and verification.”

Looking Forward

Over the coming months, Constellation Network and Common Crawl Foundation will work together to expand the solutions available to AI developers and to make cryptographically validated access to the crawl part of the standard release process.

“For users of the Crawl who are concerned about the provenance of the data, especially those using it for AI models, Constellation and their hypergraph blockchain provide an elegant solution,” said Rich Skrenta, Executive Director of the Common Crawl. “We are looking forward to adding the ability to securely validate the crawl as part of our standard distribution by partnering with Constellation.”

Evidence of this integration can be found on Constellation’s transaction viewer, called the “DAG explorer,” and developers can get started using verified historical crawls for AI applications. Please follow along for further solutions to be developed by Constellation, Forward Edge-AI, and Common Crawl. 

About Constellation Network

Constellation is a leading blockchain network advancing innovation through on-chain data security, partnering with critical global stakeholders, including the U.S. Department of Defense, to deliver transformative, next-generation technologies.

About Common Crawl Foundation

The Common Crawl Foundation is a 501(c)(3) non-profit organization dedicated to providing a copy of the internet to the public, free of charge. Their web archive consists of petabytes of data collected over years of web crawling, serving as a critical resource for researchers, businesses, and developers worldwide.

About Forward Edge-AI

Forward Edge-AI is at the forefront of a revolution in responsible and inclusive Artificial Intelligence (AI) for the betterment of humanity. Since its founding in 2019, its goal has been to become the dominant player in Artificial Intelligence and to lead the revolution in augmenting edge technology with human intelligence.

Contact

Email: press@constellationnetwork.io 

Website: https://constellationnetwork.io/ 

Twitter: https://x.com/conste11ation 

GitHub: https://github.com/Constellation-Labs/tessellation

DAG Explorer: https://mainnet.dagexplorer.io/

Contact

Dagnum PI

dagnum@stardust-collective.org

A photo accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/aed92cb9-444b-4b50-9a0b-5ee1889af9ea

GlobeNews Wire
