This page provides access to and statistics about class-specific subsets of the Schema.org data contained in the December 2020 version of the Web Data Commons Microdata and JSON-LD corpus. The datasets are part of the Web Data Commons Schema.org Data Set Series.
As many users are only interested in specific types of Schema.org data (like product data, event data, or address data), we have created class-specific subsets from the complete Microdata and JSON-LD corpora for a selection of Schema.org classes.
The subsets contain all instances of a specific class as well as all other data that is found on the webpages containing these instances. For example, a page containing data about a product might also contain reviews and offers for this product; a page containing data about an event might also contain data about the location of the event and the persons involved in the event. The data is represented in N-Quads format, meaning that the fourth element of each quad contains the URL of the webpage from which the data was extracted.
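The selection rule described above can be illustrated with a minimal Python sketch. It is not the code used to build the published subsets: the regular expression is a simplified N-Quads pattern (it assumes the graph label is always an IRI), the function names `parse_quad` and `subset_for_class` are hypothetical, and the sample quads are made up for illustration.

```python
import re

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Simplified N-Quads pattern: subject, predicate, object, graph IRI, final dot.
# Assumption: the fourth element is always an IRI (the source page URL).
QUAD_RE = re.compile(
    r'^(?P<s><[^>]*>|_:\S+)\s+'
    r'(?P<p><[^>]*>)\s+'
    r'(?P<o>.+)\s+'
    r'<(?P<g>[^>]*)>\s*\.\s*$'
)


def parse_quad(line):
    """Return (subject, predicate, object, page_url), or None if the line does not match."""
    m = QUAD_RE.match(line)
    if m is None:
        return None
    return m.group("s"), m.group("p"), m.group("o").strip(), m.group("g")


def subset_for_class(lines, class_name):
    """Keep every quad from pages that contain at least one instance of class_name."""
    # Both the http and https namespace variants are treated as the same class.
    targets = {
        f"<http://schema.org/{class_name}>",
        f"<https://schema.org/{class_name}>",
    }
    parsed = [(line, parse_quad(line)) for line in lines]
    # Pass 1: collect the URLs of pages that type some entity with the target class.
    pages = {q[3] for _, q in parsed if q and q[1] == RDF_TYPE and q[2] in targets}
    # Pass 2: keep all quads whose fourth element is one of those page URLs.
    return [line for line, q in parsed if q and q[3] in pages]


# Made-up sample quads (not taken from the corpus).
sample_lines = [
    '_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Product> <http://example.com/p1> .',
    '_:b0 <http://schema.org/name> "Widget <b>X</b>"@en <http://example.com/p1> .',
    '_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Offer> <http://example.com/p1> .',
    '_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Event> <http://example.com/e1> .',
]

product_subset = subset_for_class(sample_lines, "Product")
print(len(product_subset))  # 3: the Product, its name, and the Offer on the same page
```

The sketch holds all lines in memory for brevity; for files of the sizes listed below, a streaming two-pass approach over the compressed file would be more appropriate.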
Please note that
Class Name | Total Number of | Top Classes (Entity Count) | Total File Size | Quad File |
---|---|---|---|---|
AdministrativeArea | Quads: 17,253,343 URLs: 171,224 Hosts: 565 | http://schema.org/AdministrativeArea (324,596)http://schema.org/City (320,027)https://schema.org/AdministrativeArea (246,534)http://schema.org/ImageObject (221,708)https://schema.org/Place (164,017) | 217.06 MB | schema_AdministrativeArea.gz (sample) |
Airport | Quads: 9,243,641 URLs: 81,018 Hosts: 236 | http://schema.org/Airport (1,174,302)http://schema.org/Airline (131,691)http://schema.org/GeoCoordinates (108,773)https://schema.org/Airport (85,743)http://schema.org/Flight (67,699) | 89.00 MB | schema_Airport.gz (sample) |
Book | Quads: 312,641,237 URLs: 4,677,115 Hosts: 11,868 | http://schema.org/Book (23,912,305)http://schema.org/Person (7,040,572)http://schema.org/Offer (6,751,080)http://schema.org/ScholarlyArticle (4,055,334)http://schema.org/Organization (2,185,542) | 5.36 GB | schema_Book.gz (sample) |
City | Quads: 68,864,477 URLs: 525,983 Hosts: 3,082 | http://schema.org/City (1,839,852)http://schema.org/ListItem (1,439,955)http://schema.org/Person (1,385,309)http://schema.org/PostalAddress (1,345,343)http://schema.org/ImageObject (1,254,637) | 683.44 MB | schema_City.gz (sample) |
CollegeOrUniversity | Quads: 60,718,331 URLs: 556,309 Hosts: 938 | http://schema.org/CollegeOrUniversity (3,242,266)http://schema.org/Person (2,271,909)http://schema.org/Organization (1,836,464)http://schema.org/PostalAddress (1,244,090)http://schema.org/GeoCoordinates (1,028,364) | 658.72 MB | schema_CollegeOrUniversity.gz (sample) |
Continent | Quads: 2,213,109 URLs: 17,045 Hosts: 25 | http://schema.org/City (268,652)http://schema.org/AdministrativeArea (162,222)http://schema.org/Place (33,514)http://schema.org/Country (21,153)http://schema.org/Continent (17,078) | 19.41 MB | schema_Continent.gz (sample) |
Country | Quads: 96,108,835 URLs: 760,890 Hosts: 4,120 | http://schema.org/Country (6,966,900)http://schema.org/ListItem (2,771,026)http://schema.org/Offer (1,515,117)http://schema.org/Product (1,364,710)http://schema.org/ImageObject (959,514) | 982.87 MB | schema_Country.gz (sample) |
Course | Quads: 4,585,555 URLs: 93,819 Hosts: 749 | http://schema.org/Course (113,344)https://schema.org/Course (81,842)http://schema.org/PostalAddress (74,273)http://schema.org/ListItem (73,510)http://schema.org/Person (70,215) | 108.59 MB | schema_Course.gz (sample) |
CreativeWork | Quads: 2,614,612,185 URLs: 39,482,049 Hosts: 868,731 | https://schema.org/Person (87,466,609)https://schema.org/CreativeWork (70,031,071)https://schema.org/Comment (59,711,319)https://schema.org/SiteNavigationElement (35,953,959)https://schema.org/WPHeader (26,550,989) | 91.54 GB | schema_CreativeWork.gz (sample) |
Dataset | Quads: 72,591,326 URLs: 319,115 Hosts: 606 | http://schema.org/DataDownload (11,931,269)http://schema.org/Dataset (2,289,738)http://schema.org/Text (189,355)https://schema.org/Dataset (177,201)https://schema.org/DataDownload (146,351) | 437.33 MB | schema_Dataset.gz (sample) |
EducationalOrganization | Quads: 16,425,220 URLs: 390,075 Hosts: 2,438 | http://schema.org/EducationalOrganization (466,416)http://schema.org/PostalAddress (356,016)https://schema.org/EducationalOrganization (332,241)https://schema.org/PostalAddress (153,564)https://schema.org/ListItem (143,752) | 317.33 MB | schema_EducationalOrganization.gz (sample) |
Event | Quads: 425,784,402 URLs: 5,306,600 Hosts: 85,877 | http://schema.org/Event (30,853,002)http://schema.org/Place (17,421,101)http://schema.org/PostalAddress (8,535,335)https://schema.org/Event (2,913,324)https://schema.org/Place (2,903,029) | 6.12 GB | schema_Event.gz (sample) |
GeoCoordinates | Quads: 985,819,676 URLs: 9,424,186 Hosts: 94,612 | http://schema.org/PostalAddress (17,972,444)http://schema.org/GeoCoordinates (16,764,701)http://schema.org/ListItem (12,645,897)http://schema.org/ImageObject (9,181,211)http://schema.org/Place (8,793,770) | 12.51 GB | schema_GeoCoordinates.gz (sample) |
GovernmentOrganization | Quads: 7,103,171 URLs: 139,318 Hosts: 556 | http://schema.org/ListItem (507,796)http://schema.org/GovernmentOrganization (214,861)http://schema.org/PostalAddress (155,074)http://schema.org/Place (39,565)http://schema.org/Event (34,473) | 125.71 MB | schema_GovernmentOrganization.gz (sample) |
Hospital | Quads: 7,392,977 URLs: 161,843 Hosts: 539 | http://schema.org/Hospital (271,899)http://schema.org/PostalAddress (222,611)https://schema.org/MedicalCondition (80,952)http://schema.org/ListItem (67,828)http://schema.org/Organization (57,922) | 112.059 MB | schema_Hospital.gz (sample) |
Hotel | Quads: 154,954,065 URLs: 1,219,415 Hosts: 10,278 | https://schema.org/Hotel (2,587,637)https://schema.org/PostalAddress (2,460,856)https://schema.org/LocationFeatureSpecification (2,170,941)http://schema.org/ImageObject (2,022,791)http://schema.org/Hotel (1,986,250) | 1.75 GB | schema_Hotel.gz (sample) |
JobPosting | Quads: 36,125,795 URLs: 809,655 Hosts: 8,989 | http://schema.org/JobPosting (2,050,685)http://schema.org/Place (1,593,574)http://schema.org/PostalAddress (1,297,439)http://schema.org/Organization (1,168,906)http://schema.org/MonetaryAmount (197,612) | 1.15 GB | schema_JobPosting.gz (sample) |
LakeBodyOfWater | Quads: 272,018 URLs: 1,606 Hosts: 41 | https://schema.org/AdministrativeArea (13,941)https://schema.org/Place (12,196)https://schema.org/Map (4,271)https://schema.org/ListItem (3,724)https://schema.org/GeoCoordinates (2,334) | 3.90 MB | schema_LakeBodyOfWater.gz (sample) |
LandmarksOrHistoricalBuildings | Quads: 1,440,004 URLs: 16,041 Hosts: 199 | http://schema.org/LandmarksOrHistoricalBuildings (41,050)http://schema.org/PostalAddress (36,302)https://schema.org/Place (28,204)https://schema.org/AdministrativeArea (26,467)http://schema.org/CreativeWork (12,181) | 29.85 MB | schema_LandmarksOrHistoricalBuildings.gz (sample) |
Language | Quads: 78,668,466 URLs: 863,879 Hosts: 2,149 | http://schema.org/ListItem (4,100,326)http://schema.org/ImageObject (1,241,139)http://schema.org/Language (1,209,557)http://schema.org/Organization (1,070,075)http://schema.org/Place (676,903) | 982.37 MB | schema_Language.gz (sample) |
Library | Quads: 829,530 URLs: 16,759 Hosts: 233 | http://schema.org/Library (26,325)http://schema.org/PostalAddress (26,031)http://schema.org/Book (14,000)http://schema.org/Organization (12,915)http://schema.org/State (8,596) | 16.57 MB | schema_Library.gz (sample) |
LocalBusiness | Quads: 1,117,386,952 URLs: 25,111,808 Hosts: 319,706 | http://schema.org/LocalBusiness (45,152,336)http://schema.org/PostalAddress (35,403,200)http://schema.org/ImageObject (13,895,492)http://schema.org/ListItem (13,622,349)http://schema.org/Offer (7,863,940) | 16.71 GB | schema_LocalBusiness.gz (sample) |
Mountain | Quads: 938,476 URLs: 9,594 Hosts: 35 | http://schema.org/Mountain (43,595)http://schema.org/Review (34,378)https://schema.org/AdministrativeArea (22,777)https://schema.org/Place (20,366)http://schema.org/GeoCoordinates (11,921) | 13.01 MB | schema_Mountain.gz (sample) |
Movie | Quads: 143,290,784 URLs: 2,031,383 Hosts: 6,475 | http://schema.org/Person (7,457,293)http://schema.org/Movie (4,366,091)https://schema.org/Person (2,552,464)http://schema.org/Place (1,449,671)https://schema.org/Comment (1,339,495) | 2.48 GB | schema_Movie.gz (sample) |
Museum | Quads: 3,805,971 URLs: 52,126 Hosts: 249 | http://schema.org/PostalAddress (67,949)http://schema.org/GeoCoordinates (62,188)http://schema.org/Museum (61,974)https://schema.org/Place (53,698)https://schema.org/GeoCoordinates (50,693) | 48.15 MB | schema_Museum.gz (sample) |
MusicAlbum | Quads: 74,737,343 URLs: 568,141 Hosts: 3,316 | http://schema.org/MusicRecording (6,115,224)http://schema.org/MusicGroup (2,090,601)http://schema.org/MusicAlbum (1,736,556)http://schema.org/ListItem (834,005)https://schema.org/MusicGroup (555,642) | 760.75 MB | schema_MusicAlbum.gz (sample) |
MusicRecording | Quads: 128,371,375 URLs: 1,190,858 Hosts: 6,271 | http://schema.org/MusicRecording (11,081,124)http://schema.org/Country (5,269,280)http://schema.org/MusicGroup (2,597,500)https://schema.org/MusicRecording (1,608,591)http://schema.org/ListItem (1,063,460) | 1.28 GB | schema_MusicRecording.gz (sample) |
Organization | Quads: 9,720,580,404 URLs: 197,425,760 Hosts: 1,562,802 | https://schema.org/ImageObject (250,343,238)https://schema.org/Organization (174,621,397)http://schema.org/Organization (173,181,609)http://schema.org/ListItem (110,799,508)https://schema.org/WebPage (91,958,841) | 268.42 GB | schema_Organization.gz (sample) |
Painting | Quads: 14,092,166 URLs: 63,430 Hosts: 242 | http://schema.org/Person (5,255,462)http://schema.org/Painting (428,577)http://schema.org/Property (91,050)http://schema.org/CreativeWork (24,314)https://schema.org/Painting (18,980) | 98.12 MB | schema_Painting.gz (sample) |
Park | Quads: 1,339,369 URLs: 6,678 Hosts: 138 | http://schema.org/GeoCoordinates (66,417)https://schema.org/Place (39,290)https://schema.org/AdministrativeArea (37,016)http://schema.org/Park (31,822)http://schema.org/PostalAddress (15,001) | 14.95 MB | schema_Park.gz (sample) |
Person | Quads: 8,426,978,897 URLs: 144,484,136 Hosts: 1,245,426 | http://schema.org/Person (268,445,797)https://schema.org/Person (226,464,489)https://schema.org/ImageObject (170,194,109)https://schema.org/Organization (92,939,476)https://schema.org/WebPage (77,632,114) | 264.86 GB | schema_Person.gz (sample) |
Place | Quads: 1,056,779,340 URLs: 12,115,223 Hosts: 110,006 | http://schema.org/Place (41,474,812)http://schema.org/PostalAddress (24,971,298)http://schema.org/Event (19,153,162)http://schema.org/ListItem (9,969,006)http://schema.org/GeoCoordinates (8,122,674) | 14.33 GB | schema_Place.gz (sample) |
Product | Quads: 9,773,335,854 URLs: 182,567,594 Hosts: 1,212,831 | http://schema.org/Product (460,918,383)http://schema.org/Offer (390,312,604)http://schema.org/ListItem (178,103,842)https://schema.org/Product (108,264,711)https://schema.org/Offer (97,032,755) | 157.38 GB | schema_Product.gz (sample) |
QAPage | Quads: 94,516,847 URLs: 1,525,614 Hosts: 2,218 | http://schema.org/Person (5,801,866)http://schema.org/Answer (4,442,021)https://schema.org/Person (1,959,086)http://schema.org/QAPage (1,277,558)http://schema.org/Question (1,224,807) | 2.9 GB | schema_QAPage.gz (sample) |
RadioStation | Quads: 9,193,715 URLs: 190,551 Hosts: 281 | http://schema.org/ListItem (323,645)http://schema.org/NewsArticle (303,032)http://schema.org/RadioStation (212,671)http://schema.org/Organization (112,889)http://schema.org/ItemList (108,183) | 143.34 MB | schema_RadioStation.gz (sample) |
Recipe | Quads: 124,823,945 URLs: 2,197,375 Hosts: 19,424 | http://schema.org/Recipe (2,456,824)http://schema.org/Person (1,767,806)https://schema.org/Person (1,733,121)https://schema.org/Comment (1,584,269)http://schema.org/Comment (1,351,116) | 3.64 GB | schema_Recipe.gz (sample) |
Restaurant | Quads: 75,317,505 URLs: 606,199 Hosts: 22,488 | http://schema.org/Product (5,458,868)http://schema.org/Offer (2,287,129)http://schema.org/Restaurant (2,057,868)http://schema.org/Rating (1,461,565)http://schema.org/Review (1,447,898) | 820.00 MB | schema_Restaurant.gz (sample) |
RiverBodyOfWater | Quads: 50,543 URLs: 1,291 Hosts: 10 | https://schema.org/Canal (3,953)http://schema.org/PropertyValue (2,122)http://schema.org/RiverBodyOfWater (1,189)https://schema.org/Service (1,063)http://schema.org/GeoCoordinates (695) | 1.71 MB | schema_RiverBodyOfWater.gz (sample) |
School | Quads: 5,994,977 URLs: 119,542 Hosts: 438 | http://schema.org/School (167,680)http://schema.org/Person (98,356)http://schema.org/PostalAddress (94,565)http://schema.org/Organization (74,210)http://schema.org/QuantitativeValue (61,473) | 89.23 MB | schema_School.gz (sample) |
ShoppingCenter | Quads: 4,180,198 URLs: 62,212 Hosts: 323 | https://schema.org/ShoppingCenter (132,348)http://schema.org/ShoppingCenter (74,783)http://schema.org/ListItem (60,708)http://schema.org/Place (48,537)http://schema.org/GeoCoordinates (45,406) | 53.56 MB | schema_ShoppingCenter.gz (sample) |
SkiResort | Quads: 402,582 URLs: 5,814 Hosts: 53 | http://schema.org/ListItem (22,022)http://schema.org/SkiResort (6,631)http://schema.org/ImageObject (6,395)http://schema.org/PostalAddress (4,714)http://schema.org/Rating (4,354) | 7.11 MB | schema_SkiResort.gz (sample) |
SportsEvent | Quads: 96,984,930 URLs: 475,351 Hosts: 4,389 | http://schema.org/SportsTeam (5,667,006)http://schema.org/SportsEvent (4,510,098)http://schema.org/Place (3,050,294)http://schema.org/PostalAddress (2,401,422)http://schema.org/Offer (829,874) | 802.04 MB | schema_SportsEvent.gz (sample) |
SportsTeam | Quads: 94,118,140 URLs: 564,639 Hosts: 3,722 | http://schema.org/SportsTeam (6,823,136)http://schema.org/SportsEvent (2,878,410)http://schema.org/Place (2,112,485)http://schema.org/PostalAddress (1,773,868)http://schema.org/Person (1,059,886) | 795.39 MB | schema_SportsTeam.gz (sample) |
StadiumOrArena | Quads: 8,272,536 URLs: 23,997 Hosts: 115 | http://schema.org/SportsTeam (447,257)http://schema.org/SportsEvent (271,087)http://schema.org/StadiumOrArena (185,978)https://schema.org/StadiumOrArena (157,520)http://schema.org/GeoCoordinates (127,995) | 72.04 MB | schema_StadiumOrArena.gz (sample) |
TVEpisode | Quads: 36,133,608 URLs: 309,554 Hosts: 888 | http://schema.org/TVEpisode (2,078,621)https://schema.org/TVEpisode (1,108,278)http://schema.org/Person (651,955)http://schema.org/OnDemandEvent (473,954)http://schema.org/VideoObject (327,921) | 394.98 MB | schema_TVEpisode.gz (sample) |
TelevisionStation | Quads: 526,365 URLs: 4,404 Hosts: 25 | http://schema.org/CreativeWorkSeries (25,842)http://schema.org/Episode (23,928)http://schema.org/SiteNavigationElement (23,780)http://schema.org/ListItem (21,751)http://schema.org/WebPage (6,837) | 6.95 MB | schema_TelevisionStation.gz (sample) |
Class Name | Total Number of | Top Classes (Entity Count) | Total File Size | Quad File |
---|---|---|---|---|
AdministrativeArea | Quads: 33,240,345 URLs: 284,331 Hosts: 1,580 | http://schema.org/WPFooter (795,777)http://schema.org/AdministrativeArea (577,686)http://schema.org/ImageObject (563,080)http://schema.org/ListItem (486,729)http://schema.org/Organization (478,506) | 477.54 MB | schema_AdministrativeArea.gz (sample) |
Airport | Quads: 38,171,055 URLs: 101,701 Hosts: 504 | http://schema.org/Airport (2,825,308)http://schema.org/GeoCoordinates (1,732,815)http://schema.org/Flight (1,190,140)http://schema.org/Airline (923,701)http://schema.org/Offer (741,902) | 346.45 MB | schema_Airport.gz (sample) |
Book | Quads: 113,833,962 URLs: 2,187,378 Hosts: 4,807 | http://schema.org/Country (7,213,520)http://schema.org/Book (3,174,841)http://schema.org/Person (2,287,637)http://schema.org/Offer (1,725,345)http://schema.org/ListItem (1,682,115) | 1.86 GB | schema_Book.gz (sample) |
City | Quads: 92,919,665 URLs: 700,764 Hosts: 6,893 | http://schema.org/City (2,329,838)http://schema.org/ListItem (2,123,566)http://schema.org/Offer (1,251,020)http://schema.org/Organization (1,244,646)http://schema.org/Person (1,171,830) | 1.16 GB | schema_City.gz (sample) |
CollegeOrUniversity | Quads: 33,055,990 URLs: 457,835 Hosts: 2,359 | http://schema.org/CollegeOrUniversity (1,145,894)http://schema.org/Person (719,357)http://schema.org/PostalAddress (696,727)http://schema.org/ImageObject (643,722)http://schema.org/ListItem (462,089) | 807.43 MB | schema_CollegeOrUniversity.gz (sample) |
Continent | Quads: 219,819 URLs: 1,568 Hosts: 28 | http://schema.org/Rating (7,917)http://schema.org/Review (7,917)http://schema.org/Brand (6,859)http://schema.org/Continent (5,827)http://schema.org/Person (1,206) | 3.43 MB | schema_Continent.gz (sample) |
Country | Quads: 671,645,567 URLs: 7,076,645 Hosts: 41,478 | http://schema.org/Country (30,355,542)http://schema.org/ListItem (26,223,152)http://schema.org/Organization (17,541,358)http://schema.org/ContactPoint (13,356,222)http://schema.org/Offer (12,536,573) | 9.74 GB | schema_Country.gz (sample) |
Course | Quads: 75,694,406 URLs: 843,835 Hosts: 7,283 | http://schema.org/Course (2,571,706)http://schema.org/Organization (2,118,646)http://schema.org/ListItem (2,007,784)http://schema.org/EducationalOrganization (1,165,250)http://schema.org/Person (801,178) | 1.6 GB | schema_Course.gz (sample) |
CreativeWork | Quads: 386,382,089 URLs: 4,818,365 Hosts: 20,530 | http://schema.org/ImageObject (7,801,882)http://schema.org/CreativeWork (6,679,106)http://schema.org/Person (6,675,448)http://schema.org/CollectionPage (5,984,248)http://schema.org/Offer (4,974,802) | 6.13 GB | schema_CreativeWork.gz (sample) |
Dataset | Quads: 34,478,693 URLs: 688,967 Hosts: 924 | http://schema.org/Dataset (1,547,578)http://schema.org/Organization (1,213,528)http://schema.org/ImageObject (788,975)http://schema.org/Person (574,028)http://schema.org/PropertyValue (564,310) | 474.28 MB | schema_Dataset.gz (sample) |
EducationalOrganization | Quads: 55,637,601 URLs: 635,745 Hosts: 5,940 | http://schema.org/EducationalOrganization (2,143,339)http://schema.org/ListItem (1,885,596)http://schema.org/Course (1,026,005)http://schema.org/PostalAddress (991,311)http://schema.org/ImageObject (600,128) | 786.68 MB | schema_EducationalOrganization.gz (sample) |
Event | Quads: 996,707,005 URLs: 8,545,072 Hosts: 197,513 | http://schema.org/Event (37,963,935)http://schema.org/Place (29,870,864)http://schema.org/PostalAddress (25,156,958)http://schema.org/Person (13,617,077)http://schema.org/Offer (11,244,289) | 14.17 GB | schema_Event.gz (sample) |
GeoCoordinates | Quads: 2,243,402,571 URLs: 22,348,738 Hosts: 334,216 | http://schema.org/PostalAddress (44,145,435)http://schema.org/GeoCoordinates (42,551,463)http://schema.org/ListItem (29,698,769)http://schema.org/Offer (23,569,030)http://schema.org/OpeningHoursSpecification (23,395,597) | 30.14 GB | schema_GeoCoordinates.gz (sample) |
GovernmentOrganization | Quads: 14,545,111 URLs: 174,975 Hosts: 648 | http://schema.org/ImageObject (469,428)http://schema.org/PropertyValue (348,186)http://schema.org/GovernmentOrganization (308,513)http://schema.org/PostalAddress (303,890)http://schema.org/ListItem (168,417) | 181.73 MB | schema_GovernmentOrganization.gz (sample) |
Hospital | Quads: 12,784,970 URLs: 147,528 Hosts: 1,460 | http://schema.org/PostalAddress (323,641)http://schema.org/Hospital (274,087)http://schema.org/Physician (239,812)http://schema.org/ListItem (236,841)http://schema.org/Person (192,135) | 197.82 MB | schema_Hospital.gz (sample) |
Hotel | Quads: 200,094,788 URLs: 1,875,907 Hosts: 19,935 | http://schema.org/LocationFeatureSpecification (6,380,787)http://schema.org/Hotel (4,890,343)http://schema.org/ImageObject (3,975,391)http://schema.org/Rating (3,406,602)http://schema.org/PostalAddress (3,391,716) | 2.70 GB | schema_Hotel.gz (sample) |
JobPosting | Quads: 106,569,505 URLs: 2,743,698 Hosts: 29,263 | http://schema.org/Organization (3,626,768)http://schema.org/PostalAddress (3,507,018)http://schema.org/Place (3,377,839)http://schema.org/JobPosting (2,980,284)http://schema.org/MonetaryAmount (1,534,023) | 4.25 GB | schema_JobPosting.gz (sample) |
LakeBodyOfWater | Quads: 17,680 URLs: 506 Hosts: 30 | http://schema.org/GeoCoordinates (540)http://schema.org/LakeBodyOfWater (539)http://schema.org/PostalAddress (503)http://schema.org/ImageObject (464)http://schema.org/Country (302) | 0.84 MB | schema_LakeBodyOfWater.gz (sample) |
LandmarksOrHistoricalBuildings | Quads: 783,256 URLs: 20,448 Hosts: 363 | http://schema.org/PropertyValue (34,327)http://schema.org/ImageObject (34,083)http://schema.org/LandmarksOrHistoricalBuildings (22,392)http://schema.org/PostalAddress (19,697)http://schema.org/GeoCoordinates (14,519) | 17.66 MB | schema_LandmarksOrHistoricalBuildings.gz (sample) |
Language | Quads: 620,578,470 URLs: 5,841,333 Hosts: 9,225 | http://schema.org/Person (27,885,884)http://schema.org/Comment (24,027,348)http://schema.org/ListItem (11,314,292)http://schema.org/Language (9,445,203)http://schema.org/InteractionCounter (8,692,934) | 12.17 GB | schema_Language.gz (sample) |
Library | Quads: 4,364,075 URLs: 104,735 Hosts: 371 | http://schema.org/OpeningHoursSpecification (236,334)http://schema.org/Library (136,756)http://schema.org/Person (101,013)http://schema.org/ListItem (48,870)http://schema.org/Book (40,652) | 68.45 MB | schema_Library.gz (sample) |
LocalBusiness | Quads: 1,026,427,389 URLs: 16,259,475 Hosts: 371,736 | http://schema.org/LocalBusiness (22,455,432)http://schema.org/PostalAddress (22,231,121)http://schema.org/ListItem (18,997,519)http://schema.org/ImageObject (10,509,422)http://schema.org/Organization (9,815,147) | 14.91 GB | schema_LocalBusiness.gz (sample) |
Mountain | Quads: 103,983 URLs: 4,980 Hosts: 21 | http://schema.org/propertyValue (12,892)http://schema.org/Mountain (5,216)http://schema.org/GeoCoordinates (5,216)http://schema.org/ImageObject (531)http://schema.org/ListItem (316) | 2.25 MB | schema_Mountain.gz (sample) |
Movie | Quads: 55,830,940 URLs: 794,515 Hosts: 2,433 | http://schema.org/Person (5,250,395)http://schema.org/Movie (1,228,216)http://schema.org/ImageObject (721,241)http://schema.org/ListItem (654,496)http://schema.org/Organization (518,152) | 854.94 MB | schema_Movie.gz (sample) |
Museum | Quads: 3,016,713 URLs: 46,481 Hosts: 611 | http://schema.org/OpeningHoursSpecification (256,386)http://schema.org/Museum (52,726)http://schema.org/PostalAddress (46,042)http://schema.org/ImageObject (45,100)http://schema.org/GeoCoordinates (29,529) | 42.98 MB | schema_Museum.gz (sample) |
MusicAlbum | Quads: 36,579,680 URLs: 492,364 Hosts: 12,880 | http://schema.org/MusicRecording (2,222,896)http://schema.org/Country (1,792,024)http://schema.org/Offer (1,120,711)http://schema.org/AudioObject (939,489)http://schema.org/MusicAlbum (891,811) | 398.30 MB | schema_MusicAlbum.gz (sample) |
MusicRecording | Quads: 50,359,781 URLs: 818,647 Hosts: 18,179 | http://schema.org/MusicRecording (3,496,168)http://schema.org/Country (1,460,465)http://schema.org/Offer (1,343,523)http://schema.org/AudioObject (1,188,937)http://schema.org/MusicGroup (927,833) | 556.69 MB | schema_MusicRecording.gz (sample) |
Organization | Quads: 22,248,891,127 URLs: 446,125,963 Hosts: 3,975,031 | http://schema.org/ImageObject (636,897,590)http://schema.org/Organization (620,673,800)http://schema.org/ListItem (543,674,922)http://schema.org/WebPage (431,896,404)http://schema.org/Person (304,224,561) | 366.59 GB | schema_Organization.gz (sample) |
Painting | Quads: 4,720,750 URLs: 56,250 Hosts: 226 | http://schema.org/Offer (305,287)http://schema.org/Organization (235,113)http://schema.org/Person (102,181)http://schema.org/ListItem (88,631)http://schema.org/Painting (62,182) | 57.88 MB | schema_Painting.gz (sample) |
Park | Quads: 581,370 URLs: 7,337 Hosts: 369 | http://schema.org/OpeningHoursSpecification (21,962)http://schema.org/Photograph (15,020)http://schema.org/PostalAddress (9,810)http://schema.org/BuyAction (8,350)http://schema.org/PriceSpecification (8,350) | 10.41 MB | schema_Park.gz (sample) |
Person | Quads: 14,636,143,722 URLs: 270,847,391 Hosts: 3,229,934 | http://schema.org/ImageObject (572,439,701)http://schema.org/Person (410,503,714)http://schema.org/WebPage (373,862,664)http://schema.org/ListItem (270,267,799)http://schema.org/Organization (255,900,289) | 254.78 MB | schema_Person.gz (sample) |
Place | Quads: 1,844,537,340 URLs: 18,440,873 Hosts: 259,600 | http://schema.org/Place (51,780,249)http://schema.org/PostalAddress (41,720,971)http://schema.org/Event (32,140,259)http://schema.org/ListItem (21,104,045)http://schema.org/Person (20,407,066) | 28.15 MB | schema_Place.gz (sample) |
Product | Quads: 8,131,413,189 URLs: 130,771,929 Hosts: 1,259,425 | http://schema.org/Offer (296,209,520)http://schema.org/Product (227,876,733)http://schema.org/ListItem (223,064,664)http://schema.org/Organization (162,049,459)http://schema.org/ImageObject (63,395,481) | 126.81 MB | schema_Product.gz (sample) |
QAPage | Quads: 61,571,766 URLs: 1,428,868 Hosts: 3,254 | http://schema.org/Person (4,150,729)http://schema.org/Answer (2,813,252)http://schema.org/Question (1,518,957)http://schema.org/QAPage (1,515,580)http://schema.org/ListItem (1,065,072) | 1.3 GB | schema_QAPage.gz (sample) |
RadioStation | Quads: 6,101,313 URLs: 125,062 Hosts: 292 | http://schema.org/ListItem (368,358)http://schema.org/ImageObject (147,620)http://schema.org/RadioStation (126,841)http://schema.org/BreadcrumbList (115,005)http://schema.org/PostalAddress (96,216) | 73.61 MB | schema_RadioStation.gz (sample) |
Recipe | Quads: 252,301,883 URLs: 2,805,592 Hosts: 25,435 | http://schema.org/HowToStep (9,409,141)http://schema.org/Person (4,443,785)http://schema.org/ListItem (4,143,012)http://schema.org/ImageObject (3,936,720)http://schema.org/Recipe (3,552,034) | 4.40 GB | schema_Recipe.gz (sample) |
Restaurant | Quads: 195,140,801 URLs: 1,265,863 Hosts: 26,995 | http://schema.org/MenuItem (12,034,027)http://schema.org/Offer (10,484,207)http://schema.org/Restaurant (2,013,066)http://schema.org/Product (1,876,240)http://schema.org/Review (1,813,195) | 2.04 GB | schema_Restaurant.gz (sample) |
RiverBodyOfWater | Quads: 68,511 URLs: 981 Hosts: 15 | http://schema.org/ImageObject (2,979)http://schema.org/ListItem (2,870)http://schema.org/Organization (2,029)http://schema.org/RiverBodyOfWater (986)http://schema.org/WebPage (969) | 1.68 MB | schema_RiverBodyOfWater.gz (sample) |
School | Quads: 8,102,405 URLs: 179,059 Hosts: 1,279 | http://schema.org/ListItem (240,479)http://schema.org/School (230,738)http://schema.org/PostalAddress (213,249)http://schema.org/ImageObject (118,726)http://schema.org/WebSite (73,281) | 117.09 MB | schema_School.gz (sample) |
ShoppingCenter | Quads: 9,169,144 URLs: 118,691 Hosts: 1,030 | http://schema.org/Organization (233,552)http://schema.org/PostalAddress (225,105)http://schema.org/ShoppingCenter (172,396)http://schema.org/Offer (163,741)http://schema.org/ListItem (102,303) | 121.56 MB | schema_ShoppingCenter.gz (sample) |
SkiResort | Quads: 522,511 URLs: 20,552 Hosts: 146 | http://schema.org/SkiResort (23,975)http://schema.org/PostalAddress (20,987)http://schema.org/AggregateRating (17,563)http://schema.org/Review (13,134)http://schema.org/Person (13,009) | 13.49 MB | schema_SkiResort.gz (sample) |
SportsEvent | Quads: 36,007,707 URLs: 311,105 Hosts: 1,761 | http://schema.org/SportsEvent (973,237)http://schema.org/SportsTeam (963,071)http://schema.org/Place (898,725)http://schema.org/PostalAddress (723,037)http://schema.org/Organization (634,795) | 401.80 MB | schema_SportsEvent.gz (sample) |
SportsTeam | Quads: 32,491,014 URLs: 289,247 Hosts: 871 | http://schema.org/SportsTeam (1,132,395)http://schema.org/Place (779,476)http://schema.org/Person (736,516)http://schema.org/ImageObject (628,839)http://schema.org/Organization (609,024) | 354.59 MB | schema_SportsTeam.gz (sample) |
StadiumOrArena | Quads: 9,161,936 URLs: 60,663 Hosts: 127 | http://schema.org/Place (352,797)http://schema.org/Organization (319,023)http://schema.org/ImageObject (294,663)http://schema.org/SportsTeam (127,429)http://schema.org/PostalAddress (113,170) | 89.99 MB | schema_StadiumOrArena.gz (sample) |
TVEpisode | Quads: 44,182,146 URLs: 312,342 Hosts: 496 | http://schema.org/Country (5,833,535)http://schema.org/TVEpisode (1,569,706)http://schema.org/Person (1,236,570)http://schema.org/ListItem (347,549)http://schema.org/TVSeries (333,266) | 520.33 MB | schema_TVEpisode.gz (sample) |
TelevisionStation | Quads: 1,892,975 URLs: 26,031 Hosts: 132 | http://schema.org/ImageObject (72,586)http://schema.org/TelevisionStation (31,050)http://schema.org/ListItem (29,153)http://schema.org/Organization (26,497)http://schema.org/Person (25,092) | 31.30 MB | schema_TelevisionStation.gz (sample) |
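The per-class figures in the tables above (quads, URLs, hosts, and top classes) could be recomputed from a downloaded subset roughly as follows. This is a sketch under several assumptions, not the procedure used to produce the published numbers: it assumes that "Hosts" means distinct hostnames of the page URLs, that the entity count of a class is the number of distinct typed subjects per page, and that each quad sits on one line with the page URL as its last term. The helper names `split_quad`, `subset_statistics`, and `statistics_for_file` are hypothetical.

```python
import gzip
from collections import Counter
from urllib.parse import urlsplit

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def split_quad(line):
    """Roughly split one N-Quads line into (subject, predicate, object, page_url)."""
    body = line.strip()
    if not body.endswith("."):
        return None
    body = body[:-1].rstrip()
    try:
        head, graph = body.rsplit(None, 1)         # last term: the page URL
        subject, predicate, obj = head.split(None, 2)
    except ValueError:
        return None                                 # too few terms on the line
    if not (graph.startswith("<") and graph.endswith(">")):
        return None
    return subject, predicate, obj, graph[1:-1]


def subset_statistics(lines, top_n=5):
    """Count quads, distinct page URLs, distinct hosts, and the most frequent classes."""
    quad_count = 0
    urls = set()
    typed_entities = set()  # (page_url, subject, class): blank nodes are page-local
    for line in lines:
        quad = split_quad(line)
        if quad is None:
            continue
        subject, predicate, obj, url = quad
        quad_count += 1
        urls.add(url)
        if predicate == RDF_TYPE:
            typed_entities.add((url, subject, obj.strip("<>")))
    hosts = {urlsplit(url).hostname for url in urls}
    class_counts = Counter(cls for _, _, cls in typed_entities)
    return {
        "quads": quad_count,
        "urls": len(urls),
        "hosts": len(hosts),
        "top_classes": class_counts.most_common(top_n),
    }


def statistics_for_file(path, top_n=5):
    """Stream a gzip-compressed N-Quads file, e.g. a downloaded schema_*.gz subset."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return subset_statistics(handle, top_n)
```

Because `subset_statistics` accepts any iterable of lines, it can be tried on a small sample file before running `statistics_for_file` on one of the larger subsets. The sets of URLs and typed entities are kept in memory, which may not be feasible for the multi-gigabyte files.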
If you are interested in a particular class or set of classes that is not listed above, please contact the Web Data Commons team via the mailing list or our Google Group.
The Jupyter notebooks used to create the Schema.org subsets from the Microdata and JSON-LD corpus can be checked out from our GitHub repository.
The source code of the extraction framework can be checked out from our GitHub repository. For more information about the framework and a detailed description of how to run your own extraction, visit the framework page.
Please send questions and feedback to the Web Data Commons mailing list or post them in our Web Data Commons Google Group.