Class-Specific Subsets of the Schema.org Data contained in the November 2018 Corpus

This page provides access to and statistics about class-specific subsets of the Schema.org data contained in the November 2018 version of the Web Data Commons Microdata and JSON-LD corpora.

Introduction

As many users are only interested in specific types of Schema.org data (like product data, event data, or address data), we have created class-specific subsets out of the complete Microdata and JSON-LD corpora for a selection of Schema.org classes. Each subset contains all instances of a specific class as well as all other data found on the webpages containing these instances. For example, a page containing data about a product might also contain reviews and offers for this product; a page containing data about an event might also contain data about the location of the event and the persons involved in it. The data is represented in N-Quads format, meaning that the fourth element of each quad contains the URL of the webpage from which the data was extracted.
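To illustrate the quad layout described above, the following is a minimal Python sketch that splits one N-Quads line into its four elements and returns the page URL from the fourth one. The names parse_quad, page_url and Quad are hypothetical helpers introduced here for illustration; they are not part of the Web Data Commons framework, and the regular expression is a simplification of the N-Quads grammar rather than a full parser.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """The four raw terms of one N-Quads statement (hypothetical helper type)."""
    subject: str
    predicate: str
    object: str
    graph: str


# Simplified term pattern: an IRI, a blank node, or a literal with an
# optional datatype or language tag. This is an assumption-level
# approximation of the N-Quads grammar, not a complete implementation.
_TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
_QUAD = re.compile(rf'^\s*{_TERM}\s+(<[^>]*>)\s+{_TERM}\s+(<[^>]*>)\s*\.\s*$')


def parse_quad(line: str) -> Optional[Quad]:
    """Return the four terms of an N-Quads line, or None if it does not match."""
    match = _QUAD.match(line)
    return Quad(*match.groups()) if match else None


def page_url(quad: Quad) -> str:
    """Return the fourth element without the surrounding angle brackets."""
    return quad.graph[1:-1]


if __name__ == "__main__":
    # Made-up example line; the URL is not taken from the corpus.
    example = '_:n1 <http://schema.org/name> "Blue mug"@en <http://example.com/p/1> .'
    quad = parse_quad(example)
    if quad is not None:
        print(page_url(quad))
```

Lines that do not match the simplified pattern are returned as None instead of raising, so a caller can count or log them separately.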

Please note that

You are welcome to use the datasets and to tell us about your findings. If you find our datasets useful for your research, please cite the paper "The WebDataCommons Microdata, RDFa and Microformat Dataset Series" by Robert Meusel, Petar Petrovski, and Christian Bizer, published in the Proceedings of the 13th International Semantic Web Conference: Replication, Benchmark, Data and Software Track (ISWC2014).

Class-Specific Subsets of the Schema.org Data - Microdata Corpus

Class Name | Total Number of | Top Classes (Entity Count) | Total File Size | Quad File
http://schema.org/AdministrativeArea Quads: 19,939,512
URLs: 83,930
Hosts: 364
http://schema.org/City (349,105)
http://schema.org/ImageObject (345,521)
http://schema.org/AdministrativeArea (328,078)
https://schema.org/GeoCoordinates (262,503)
https://schema.org/GeoShape (243,256)
274.5 MB schema_AdministrativeArea.gz (sample)
http://schema.org/Airport Quads: 11,334,986
URLs: 111,239
Hosts: 209
http://schema.org/Airport (1,145,323)
http://schema.org/Offer (367,586)
http://www.schema.org/Product (358,324)
http://schema.org/Airline (321,377)
http://schema.org/Flight (266,135)
150.1 MB schema_Airport.gz (sample)
http://schema.org/Book Quads: 220,052,622
URLs: 4,752,518
Hosts: 9,905
http://schema.org/Book (11,468,730)
http://schema.org/Offer (8,046,611)
http://schema.org/Person (6,253,925)
http://schema.org/ListItem (1,693,034)
http://schema.org/AggregateRating (1,686,439)
5.3 GB schema_Book.gz (sample)
http://schema.org/City Quads: 37,276,145
URLs: 223,836
Hosts: 810
http://schema.org/City (3,221,928)
http://schema.org/PostalAddress (886,688)
http://schema.org/GeoCoordinates (806,262)
http://schema.org/Person (575,889)
http://schema.org/ImageObject (485,994)
569.8 MB schema_City.gz (sample)
http://schema.org/CollegeOrUniversity Quads: 51,175,010
URLs: 362,195
Hosts: 678
http://schema.org/Organization (3,689,521)
http://schema.org/CollegeOrUniversity (2,392,797)
http://schema.org/Person (1,779,436)
http://schema.org/PostalAddress (823,205)
http://schema.org/LocalBusiness (622,116)
824.9 MB schema_CollegeOrUniversity.gz (sample)
http://schema.org/Continent Quads: 2,198,788
URLs: 8,016
Hosts: 17
http://schema.org/City (298,536)
http://schema.org/AdministrativeArea (175,391)
http://schema.org/Place (28,904)
http://schema.org/Country (10,303)
http://schema.org/Continent (8,155)
25.7 MB schema_Continent.gz (sample)
http://schema.org/Country Quads: 37,175,866
URLs: 208,345
Hosts: 1,448
http://schema.org/Country (3,234,330)
http://schema.org/City (2,145,294)
http://schema.org/Person (565,958)
http://schema.org/MusicRecording (335,734)
http://schema.org/PostalAddress (298,819)
566.6 MB schema_Country.gz (sample)
http://schema.org/CreativeWork Quads: 450,017,634
URLs: 7,683,627
Hosts: 162,032
http://schema.org/CreativeWork (17,761,346)
http://schema.org/Person (6,559,518)
http://schema.org/ImageObject (6,134,190)
http://schema.org/SiteNavigationElement (5,742,194)
http://schema.org/WPHeader (3,434,330)
16 GB schema_CreativeWork.gz (sample)
http://schema.org/EducationalOrganization Quads: 8,934,588
URLs: 282,893
Hosts: 1,653
http://schema.org/EducationalOrganization (462,186)
http://schema.org/PostalAddress (248,977)
http://schema.org/ListItem (134,765)
http://schema.org/Organization (99,219)
http://schema.org/Place (80,415)
389.9 MB schema_EducationalOrganization.gz (sample)
http://schema.org/Event Quads: 365,926,884
URLs: 4,706,682
Hosts: 88,130
http://schema.org/Event (26,035,295)
http://schema.org/Place (11,832,898)
http://schema.org/PostalAddress (7,368,392)
http://schema.org/ListItem (1,601,186)
http://schema.org/Person (1,470,290)
7.3 GB schema_Event.gz (sample)
http://schema.org/GeoCoordinates Quads: 534,999,580
URLs: 6,032,791
Hosts: 57,127
http://schema.org/GeoCoordinates (16,285,080)
http://schema.org/PostalAddress (14,257,184)
http://schema.org/Place (6,709,261)
http://schema.org/ListItem (5,722,183)
http://schema.org/LocalBusiness (5,646,490)
9.5 GB schema_GeoCoordinates.gz (sample)
http://schema.org/GovernmentOrganization Quads: 7,477,525
URLs: 90,640
Hosts: 346
http://schema.org/PostalAddress (535,307)
http://schema.org/LocalBusiness (319,710)
http://schema.org/GovernmentOrganization (149,704)
http://data-vocabulary.org/Breadcrumb (46,317)
http://schema.org/ListItem (41,948)
140.2 MB schema_GovernmentOrganization.gz (sample)
http://schema.org/Hospital Quads: 4,028,324
URLs: 70,878
Hosts: 388
http://schema.org/PostalAddress (257,756)
http://schema.org/Hospital (100,192)
http://schema.org/LocalBusiness (100,075)
http://schema.org/Place (45,807)
http://schema.org/MedicalCondition (44,980)
69.4 MB schema_Hospital.gz (sample)
http://schema.org/Hotel Quads: 78,659,846
URLs: 1,030,877
Hosts: 8,175
http://schema.org/Hotel (4,321,058)
http://schema.org/PostalAddress (2,205,631)
http://schema.org/Rating (2,182,926)
http://schema.org/Review (1,736,807)
http://schema.org/ListItem (919,158)
1.7 GB schema_Hotel.gz (sample)
http://schema.org/JobPosting Quads: 69,754,384
URLs: 949,658
Hosts: 7,581
http://schema.org/JobPosting (5,078,941)
http://schema.org/Place (3,833,700)
http://schema.org/PostalAddress (3,338,185)
http://schema.org/Organization (2,158,892)
http://schema.org/MonetaryAmount (466,792)
2.1 GB schema_JobPosting.gz (sample)
http://schema.org/LakeBodyOfWater Quads: 4,557
URLs: 48
Hosts: 17
http://schema.org/PostalAddress (134)
http://schema.org/LakeBodyOfWater (92)
http://schema.org/ImageObject (42)
http://schema.org/Museum (41)
http://schema.org/Park (35)
136.4 kB schema_LakeBodyOfWater.gz (sample)
http://schema.org/LandmarksOrHistoricalBuildings Quads: 1,190,729
URLs: 15,801
Hosts: 163
http://schema.org/LandmarksOrHistoricalBuildings (70,276)
http://schema.org/GeoCoordinates (54,397)
http://schema.org/PostalAddress (19,213)
http://schema.org/ListItem (17,288)
http://schema.org/TouristAttraction (9,809)
24.6 MB schema_LandmarksOrHistoricalBuildings.gz (sample)
http://schema.org/Language Quads: 11,037,862
URLs: 121,472
Hosts: 669
http://schema.org/SiteNavigationElement (234,022)
http://schema.org/Language (196,776)
http://schema.org/GeoCoordinates (177,807)
http://schema.org/PostalAddress (171,945)
http://schema.org/City (94,574)
217.6 MB schema_Language.gz (sample)
http://schema.org/Library Quads: 545,264
URLs: 13,026
Hosts: 196
http://schema.org/Library (19,524)
http://schema.org/PostalAddress (14,137)
http://schema.org/Book (7,306)
http://schema.org/WebPage (6,090)
http://schema.org/Offer (5,655)
9.7 MB schema_Library.gz (sample)
http://schema.org/LocalBusiness Quads: 764,272,779
URLs: 17,089,614
Hosts: 294,390
http://schema.org/LocalBusiness (39,877,532)
http://schema.org/PostalAddress (29,628,355)
http://schema.org/ImageObject (8,461,328)
http://schema.org/AggregateRating (6,849,979)
http://schema.org/ListItem (6,422,465)
14.9 GB schema_LocalBusiness.gz (sample)
http://schema.org/Mountain Quads: 536,138
URLs: 13,373
Hosts: 25
http://schema.org/Mountain (30,806)
http://schema.org/Review (17,346)
http://schema.org/GeoCoordinates (15,951)
http://schema.org/Place (1,957)
http://schema.org/ListItem (564)
7.9 MB schema_Mountain.gz (sample)
http://schema.org/Movie Quads: 124,024,978
URLs: 2,139,240
Hosts: 8,845
http://schema.org/Person (6,541,469)
http://schema.org/Movie (5,233,520)
http://schema.org/AggregateRating (1,175,324)
http://data-vocabulary.org/Breadcrumb (818,056)
http://schema.org/Place (780,501)
2.7 GB schema_Movie.gz (sample)
http://schema.org/Museum Quads: 874,601
URLs: 18,801
Hosts: 195
http://schema.org/Person (50,813)
http://schema.org/Museum (24,242)
http://schema.org/PostalAddress (21,073)
http://schema.org/CreativeWork (15,806)
http://schema.org/GeoCoordinates (15,371)
16.1 MB schema_Museum.gz (sample)
http://schema.org/MusicAlbum Quads: 41,662,680
URLs: 589,277
Hosts: 9,834
http://schema.org/MusicRecording (2,111,701)
http://schema.org/Country (2,096,835)
http://schema.org/MusicAlbum (1,594,941)
http://schema.org/MusicGroup (834,812)
http://schema.org/MusicPlaylist (574,829)
671 MB schema_MusicAlbum.gz (sample)
http://schema.org/MusicRecording Quads: 86,572,548
URLs: 1,288,075
Hosts: 6,748
http://schema.org/MusicRecording (8,471,381)
http://schema.org/Country (2,686,209)
http://schema.org/MusicGroup (1,882,112)
http://schema.org/ListItem (823,549)
http://schema.org/MusicAlbum (676,333)
1.4 GB schema_MusicRecording.gz (sample)
http://schema.org/Organization Quads: 2,786,730,213
URLs: 84,854,146
Hosts: 510,068
http://schema.org/Organization (100,639,273)
http://schema.org/Product (42,256,684)
http://schema.org/Offer (37,405,972)
http://schema.org/ListItem (33,971,659)
http://schema.org/Person (26,597,048)
77.3 GB schema_Organization.gz
schemaMD_Organization_chunks.list
(sample)
http://schema.org/Painting Quads: 2,325,791
URLs: 63,861
Hosts: 265
http://schema.org/Painting (177,595)
http://schema.org/UserComments (117,300)
http://schema.org/Person (46,436)
http://schema.org/CreativeWork (19,962)
http://schema.org/WPAdBlock (14,278)
59.3 MB schema_Painting.gz (sample)
http://schema.org/Park Quads: 299,717
URLs: 3,935
Hosts: 84
http://schema.org/PostalAddress (11,108)
http://schema.org/Park (6,537)
http://schema.org/GeoCoordinates (5,272)
http://schema.org/LocalBusiness (3,380)
http://schema.org/Organization (3,031)
5.2 MB schema_Park.gz (sample)
http://schema.org/Person Quads: 2,059,342,516
URLs: 75,842,523
Hosts: 324,348
http://schema.org/Person (126,178,478)
http://schema.org/ImageObject (25,221,509)
http://schema.org/Article (17,711,737)
http://schema.org/Organization (17,091,343)
https://schema.org/ImageObject (17,091,343)
67.5 GB schema_Person.gz
schemaMD_Person_chunks.list
(sample)
http://schema.org/Place Quads: 650,691,458
URLs: 7,457,552
Hosts: 92,127
http://schema.org/Place (31,781,150)
http://schema.org/PostalAddress (19,082,808)
http://schema.org/Event (14,385,590)
http://schema.org/GeoCoordinates (5,495,809)
http://schema.org/Organization (5,065,026)
13.5 GB schema_Place.gz (sample)
http://schema.org/Product Quads: 4,846,852,072
URLs: 267,192,453
Hosts: 812,204
http://schema.org/Product (307,301,434)
http://schema.org/Offer (236,373,457)
http://schema.org/ListItem (65,797,386)
http://data-vocabulary.org/Breadcrumb (45,538,335)
http://schema.org/AggregateRating (30,460,535)
112.7 GB schema_Product.gz
schemaMD_Product_chunks.list
(sample)
http://schema.org/RadioStation Quads: 9,286,479
URLs: 61,779
Hosts: 83
http://schema.org/SiteNavigationElement (1,575,045)
http://schema.org/RadioStation (101,779)
http://schema.org/PostalAddress (91,154)
http://schema.org/LocalBusiness (58,735)
http://schema.org/WebPage (30,568)
144.1 MB schema_RadioStation.gz (sample)
http://schema.org/Recipe Quads: 80,676,015
URLs: 1,530,636
Hosts: 19,560
http://schema.org/AggregateRating (2,468,904)
http://schema.org/Recipe (2,036,417)
http://schema.org/Person (601,364)
http://schema.org/ListItem (596,527)
https://schema.org/Person (556,945)
2.3 GB schema_Recipe.gz (sample)
http://schema.org/Restaurant Quads: 47,471,311
URLs: 419,405
Hosts: 10,854
http://schema.org/PostalAddress (1,651,869)
http://schema.org/Restaurant (1,480,443)
http://schema.org/Product (1,020,678)
http://schema.org/Review (858,322)
http://schema.org/Rating (713,048)
786 MB schema_Restaurant.gz (sample)
http://schema.org/RiverBodyOfWater Quads: 6,427
URLs: 83
Hosts: 9
http://schema.org/AggregateRating (453)
http://schema.org/NewsArticle (395)
http://schema.org/Place (160)
http://schema.org/RiverBodyOfWater (140)
http://schema.org/Park (81)
90.4 kB schema_RiverBodyOfWater.gz (sample)
http://schema.org/School Quads: 5,963,114
URLs: 54,780
Hosts: 368
http://schema.org/PostalAddress (366,779)
http://schema.org/LocalBusiness (227,798)
http://schema.org/School (125,602)
http://schema.org/Thing (42,934)
http://schema.org/QuantitativeValue (40,163)
88.5 MB schema_School.gz (sample)
http://schema.org/ShoppingCenter Quads: 4,350,065
URLs: 43,312
Hosts: 286
http://schema.org/Product (202,479)
http://schema.org/PostalAddress (189,138)
http://schema.org/ShoppingCenter (136,122)
http://schema.org/ClothingStore (111,077)
http://schema.org/ListItem (62,798)
65.9 MB schema_ShoppingCenter.gz (sample)
http://schema.org/SkiResort Quads: 118,412
URLs: 10,334
Hosts: 35
http://schema.org/SkiResort (10,890)
http://schema.org/AggregateRating (8,324)
http://schema.org/Review (2,038)
http://schema.org/Person (1,477)
http://data-vocabulary.org/Breadcrumb (1,036)
3.2 MB schema_SkiResort.gz (sample)
http://schema.org/SportsEvent Quads: 41,341,455
URLs: 286,572
Hosts: 3,473
http://schema.org/SportsEvent (2,537,089)
http://schema.org/SportsTeam (2,504,561)
http://schema.org/Place (871,249)
http://schema.org/PostalAddress (586,367)
http://schema.org/Person (420,088)
629.2 MB schema_SportsEvent.gz (sample)
http://schema.org/SportsTeam Quads: 44,783,163
URLs: 314,711
Hosts: 2,440
http://schema.org/SportsTeam (3,767,315)
http://schema.org/SportsEvent (1,377,926)
http://schema.org/SportsMatchCompetitor (1,149,954)
http://schema.org/Person (1,084,125)
http://schema.org/SportsMatch (574,977)
713 MB schema_SportsTeam.gz (sample)
http://schema.org/StadiumOrArena Quads: 3,772,099
URLs: 26,410
Hosts: 86
http://schema.org/Person (271,323)
http://schema.org/SportsTeam (145,260)
http://schema.org/StadiumOrArena (121,116)
http://schema.org/PostalAddress (64,411)
http://data-vocabulary.org/Breadcrumb (51,180)
54.8 MB schema_StadiumOrArena.gz (sample)
http://schema.org/TVEpisode Quads: 34,615,542
URLs: 590,797
Hosts: 761
http://schema.org/TVEpisode (2,174,632)
http://schema.org/TVSeries (636,853)
http://data-vocabulary.org/Breadcrumb (541,048)
http://schema.org/Person (538,407)
http://schema.org/AggregateRating (449,537)
577.8 MB schema_TVEpisode.gz (sample)
http://schema.org/TelevisionStation Quads: 321,060
URLs: 5,495
Hosts: 30
http://schema.org/PostalAddress (22,108)
http://schema.org/LocalBusiness (14,586)
http://schema.org/TelevisionStation (10,437)
http://data-vocabulary.org/Breadcrumb (2,215)
http://schema.org/Physician (1,008)
4 MB schema_TelevisionStation.gz (sample)


Class-Specific Subsets of the Schema.org Data - JSON-LD Corpus

Class Name | Total Number of | Top Classes (Entity Count) | Total File Size | Quad File
http://schema.org/AdministrativeArea Quads: 1,173,357
URLs: 21,093
Hosts: 174
http://schema.org/Organization (37,033)
http://schema.org/ImageObject (28,830)
http://schema.org/AdministrativeArea (27,016)
http://schema.org/WebSite (16,391)
http://schema.org/Person (14,531)
15.9 MB schema_AdministrativeArea.gz (sample)
http://schema.org/Airport Quads: 707,850
URLs: 2,875
Hosts: 49
http://schema.org/Airport (58,302)
http://schema.org/GeoCoordinates (44,143)
http://schema.org/Flight (13,384)
http://schema.org/Airline (13,188)
http://schema.org/Offer (11,993)
7.1 MB schema_Airport.gz (sample)
http://schema.org/Book Quads: 23,305,025
URLs: 601,948
Hosts: 682
http://schema.org/Person (727,848)
http://schema.org/Book (680,422)
http://schema.org/Offer (469,393)
http://schema.org/Organization (440,298)
http://schema.org/Library (398,109)
455.4 MB schema_Book.gz (sample)
http://schema.org/City Quads: 25,057,673
URLs: 147,849
Hosts: 601
http://schema.org/GeoCoordinates (853,221)
http://schema.org/ListItem (829,751)
http://schema.org/PostalAddress (638,107)
http://schema.org/Question (588,761)
http://schema.org/Answer (588,761)
318.7 MB schema_City.gz (sample)
http://schema.org/CollegeOrUniversity Quads: 4,716,079
URLs: 59,872
Hosts: 774
http://schema.org/ImageObject (164,964)
http://schema.org/CollegeOrUniversity (121,181)
http://schema.org/PostalAddress (99,915)
http://schema.org/ListItem (79,426)
http://schema.org/CreativeWork (78,942)
60.9 MB schema_CollegeOrUniversity.gz (sample)
http://schema.org/Continent Quads: 2,052
URLs: 40
Hosts: 4
http://schema.org/Continent (146)
http://schema.org/ImageObject (64)
http://schema.org/Service (42)
http://schema.org/Brand (22)
http://schema.org/Organization (21)
17.4 kB schema_Continent.gz (sample)
http://schema.org/Country Quads: 121,567,540
URLs: 601,169
Hosts: 939
http://schema.org/Country (30,645,493)
http://schema.org/EntryPoint (873,153)
http://schema.org/MusicRecording (857,355)
http://schema.org/ListItem (581,091)
http://schema.org/Offer (458,434)
830.6 MB schema_Country.gz (sample)
http://schema.org/CreativeWork Quads: 44,831,982
URLs: 659,710
Hosts: 1,370
http://schema.org/ListItem (1,692,793)
http://schema.org/CreativeWork (1,223,299)
http://schema.org/Person (1,075,558)
http://schema.org/ImageObject (659,368)
http://schema.org/GeoCoordinates (536,545)
712.1 MB schema_CreativeWork.gz (sample)
http://schema.org/EducationalOrganization Quads: 1,574,213
URLs: 42,860
Hosts: 587
http://schema.org/EducationalOrganization (52,395)
http://schema.org/PostalAddress (42,614)
http://schema.org/ListItem (19,914)
http://schema.org/ContactPoint (16,128)
http://schema.org/ImageObject (14,775)
23.3 MB schema_EducationalOrganization.gz (sample)
http://schema.org/Event Quads: 64,536,165
URLs: 1,322,588
Hosts: 63,599
http://schema.org/Event (2,659,979)
http://schema.org/Place (2,016,053)
http://schema.org/PostalAddress (1,998,725)
http://schema.org/GeoCoordinates (984,300)
http://schema.org/Offer (856,860)
962.7 MB schema_Event.gz (sample)
http://schema.org/GeoCoordinates Quads: 282,908,884
URLs: 3,485,616
Hosts: 71,894
http://schema.org/PostalAddress (6,781,030)
http://schema.org/GeoCoordinates (6,493,653)
http://schema.org/ImageObject (4,482,183)
http://schema.org/ListItem (3,975,474)
http://schema.org/OpeningHoursSpecification (3,724,531)
3.7 GB schema_GeoCoordinates.gz (sample)
http://schema.org/GovernmentOrganization Quads: 1,430,443
URLs: 18,182
Hosts: 151
http://schema.org/PostalAddress (38,579)
http://schema.org/GovernmentOrganization (28,062)
http://schema.org/PropertyValue (25,644)
http://schema.org/ImageObject (24,926)
http://schema.org/ListItem (21,550)
16.6 MB schema_GovernmentOrganization.gz (sample)
http://schema.org/Hospital Quads: 2,257,355
URLs: 32,801
Hosts: 473
http://schema.org/Hospital (126,490)
http://schema.org/AggregateRating (110,212)
http://schema.org/PostalAddress (25,028)
http://schema.org/ContactPoint (23,263)
http://schema.org/WebSite (17,240)
29.7 MB schema_Hospital.gz (sample)
http://schema.org/Hotel Quads: 15,713,081
URLs: 268,080
Hosts: 2,749
http://schema.org/LocationFeatureSpecification (703,660)
http://schema.org/ImageObject (475,884)
http://schema.org/ListItem (400,062)
http://schema.org/PostalAddress (333,920)
http://schema.org/Hotel (304,900)
233.4 MB schema_Hotel.gz (sample)
http://schema.org/JobPosting Quads: 11,861,811
URLs: 317,904
Hosts: 1,570
http://schema.org/Organization (493,516)
http://schema.org/Place (388,242)
http://schema.org/JobPosting (386,451)
http://schema.org/PostalAddress (376,328)
http://schema.org/MonetaryAmount (287,927)
455.5 MB schema_JobPosting.gz (sample)
http://schema.org/LakeBodyOfWater Quads: 7,131
URLs: 412
Hosts: 7
http://schema.org/LakeBodyOfWater (422)
http://schema.org/GeoCoordinates (405)
http://schema.org/PostalAddress (305)
http://schema.org/Country (291)
http://schema.org/ListItem (33)
94.9 kB schema_LakeBodyOfWater.gz (sample)
http://schema.org/LandmarksOrHistoricalBuildings Quads: 11,467
URLs: 447
Hosts: 11
http://schema.org/LandmarksOrHistoricalBuildings (466)
http://schema.org/GeoCoordinates (416)
http://schema.org/PostalAddress (380)
http://schema.org/WebSite (341)
http://schema.org/SearchAction (341)
174.3 kB schema_LandmarksOrHistoricalBuildings.gz (sample)
http://schema.org/Language Quads: 54,692,185
URLs: 534,112
Hosts: 1,522
http://schema.org/Person (2,099,068)
http://schema.org/Comment (1,812,394)
http://schema.org/PropertyValue (1,184,761)
http://schema.org/ListItem (966,036)
http://schema.org/InteractionCounter (898,463)
1.1 GB schema_Language.gz (sample)
http://schema.org/Library Quads: 10,769,250
URLs: 282,847
Hosts: 122
http://schema.org/Person (662,165)
http://schema.org/Library (547,316)
http://schema.org/Offer (264,012)
http://schema.org/Product (264,012)
http://schema.org/ItemAvailability (264,011)
190.9 MB schema_Library.gz (sample)
http://schema.org/LocalBusiness Quads: 172,149,277
URLs: 5,878,341
Hosts: 248,966
http://schema.org/LocalBusiness (6,738,813)
http://schema.org/WebSite (4,226,197)
http://schema.org/Organization (4,187,771)
http://schema.org/PostalAddress (3,146,212)
http://schema.org/OpeningHoursSpecification (2,628,105)
3 GB schema_LocalBusiness.gz (sample)
http://schema.org/Mountain Quads: 202
URLs: 13
Hosts: 6
http://schema.org/Mountain (14)
http://schema.org/GeoCoordinates (12)
http://schema.org/AggregateRating (4)
http://schema.org/Person (3)
http://schema.org/Organization (2)
4.1 kB schema_Mountain.gz (sample)
http://schema.org/Movie Quads: 19,296,159
URLs: 216,852
Hosts: 349
http://schema.org/Person (2,154,482)
http://schema.org/Organization (424,346)
http://schema.org/Movie (335,110)
http://schema.org/ImageObject (236,163)
http://schema.org/Review (229,346)
329.5 MB schema_Movie.gz (sample)
http://schema.org/Museum Quads: 220,143
URLs: 5,251
Hosts: 44
http://schema.org/OpeningHoursSpecification (13,160)
http://schema.org/PostalAddress (6,073)
http://schema.org/Museum (5,496)
http://schema.org/GeoCoordinates (3,154)
http://schema.org/WebPage (1,931)
2.7 MB schema_Museum.gz (sample)
http://schema.org/MusicAlbum Quads: 45,193,254
URLs: 451,929
Hosts: 108
http://schema.org/MusicRecording (3,153,267)
http://schema.org/Offer (2,429,526)
http://schema.org/AudioObject (2,242,093)
http://schema.org/MusicAlbum (1,429,141)
http://schema.org/Country (978,008)
500 MB schema_MusicAlbum.gz (sample)
http://schema.org/MusicRecording Quads: 43,919,021
URLs: 396,803
Hosts: 98
http://schema.org/MusicRecording (3,424,237)
http://schema.org/Offer (2,752,693)
http://schema.org/AudioObject (2,447,727)
http://schema.org/MusicAlbum (1,191,219)
http://schema.org/EntryPoint (465,999)
475.2 MB schema_MusicRecording.gz (sample)
http://schema.org/Organization Quads: 2,330,338,405
URLs: 115,345,403
Hosts: 1,349,434
http://schema.org/Organization (116,610,888)
http://schema.org/ImageObject (63,161,524)
http://schema.org/ListItem (47,090,381)
http://schema.org/Person (36,967,548)
http://schema.org/WebPage (27,732,643)
48.6 GB schema_Organization.gz
schemaJSON_Organization_chunks.list
(sample)
http://schema.org/Painting Quads: 90,581
URLs: 1,917
Hosts: 15
http://schema.org/Painting (3,282)
http://schema.org/Product (3,055)
http://schema.org/Person (3,001)
http://schema.org/thing (2,999)
http://schema.org/Offer (2,542)
1.2 MB schema_Painting.gz (sample)
http://schema.org/Park Quads: 26,196
URLs: 698
Hosts: 22
http://schema.org/PostalAddress (985)
http://schema.org/Park (773)
http://schema.org/GeoCoordinates (767)
http://schema.org/OpeningHoursSpecification (485)
http://schema.org/ListItem (315)
354.4 kB schema_Park.gz (sample)
http://schema.org/Person Quads: 1,733,170,226
URLs: 64,301,806
Hosts: 335,768
http://schema.org/ListItem (68,467,039)
http://schema.org/Person (68,094,990)
http://schema.org/ImageObject (52,207,763)
http://schema.org/Organization (32,139,431)
http://schema.org/WebPage (20,001,702)
34.9 GB schema_Person.gz
schemaJSON_Person_chunks.list
(sample)
http://schema.org/Place Quads: 288,665,556
URLs: 3,734,134
Hosts: 66,395
http://schema.org/Place (9,671,181)
http://schema.org/PostalAddress (8,040,281)
http://schema.org/ListItem (4,486,900)
http://schema.org/ImageObject (4,069,468)
http://schema.org/SportsTeam (3,589,197)
3.9 GB schema_Place.gz (sample)
http://schema.org/Product Quads: 409,860,145
URLs: 9,280,437
Hosts: 40,128
http://schema.org/Product (20,438,975)
http://schema.org/Offer (15,387,909)
http://schema.org/ListItem (10,303,558)
http://schema.org/Organization (5,701,030)
http://schema.org/BreadcrumbList (2,477,228)
6.9 GB schema_Product.gz (sample)
http://schema.org/RadioStation Quads: 348,966
URLs: 22,757
Hosts: 44
http://schema.org/RadioStation (22,959)
http://schema.org/AggregateRating (19,208)
http://schema.org/EntryPoint (5,416)
http://schema.org/WebSite (2,870)
http://schema.org/SearchAction (2,751)
5.2 MB schema_RadioStation.gz (sample)
http://schema.org/Recipe Quads: 31,826,504
URLs: 619,875
Hosts: 1,635
http://schema.org/Recipe (656,184)
http://schema.org/Person (645,036)
http://schema.org/ImageObject (569,904)
http://schema.org/Organization (414,622)
http://schema.org/SiteNavigationElement (372,278)
704.4 MB schema_Recipe.gz (sample)
http://schema.org/Restaurant Quads: 37,004,205
URLs: 215,974
Hosts: 3,027
http://schema.org/Offer (3,658,745)
http://schema.org/Product (1,978,114)
http://schema.org/MenuItem (1,778,562)
http://schema.org/OfferCatalog (451,454)
http://schema.org/OpeningHoursSpecification (278,970)
335.1 MB schema_Restaurant.gz (sample)
http://schema.org/RiverBodyOfWater Quads: 37,265
URLs: 551
Hosts: 5
http://schema.org/ImageObject (1,616)
http://schema.org/ListItem (1,614)
http://schema.org/Organization (1,037)
http://schema.org/RiverBodyOfWater (558)
http://schema.org/GeoCoordinates (546)
324 kB schema_RiverBodyOfWater.gz (sample)
http://schema.org/School Quads: 3,606,983
URLs: 54,404
Hosts: 227
http://schema.org/ListItem (180,781)
http://schema.org/PostalAddress (65,736)
http://schema.org/Review (61,272)
http://schema.org/School (60,045)
http://schema.org/Rating (57,601)
47 MB schema_School.gz (sample)
http://schema.org/ShoppingCenter Quads: 469,020
URLs: 8,397
Hosts: 179
http://schema.org/OpeningHoursSpecification (23,560)
http://schema.org/PostalAddress (10,703)
http://schema.org/ShoppingCenter (9,907)
http://schema.org/GeoCoordinates (7,708)
http://schema.org/ContactPoint (4,921)
4.1 MB schema_ShoppingCenter.gz (sample)
http://schema.org/SkiResort Quads: 33,971
URLs: 660
Hosts: 39
http://schema.org/PostalAddress (991)
http://schema.org/SkiResort (922)
http://schema.org/GeoCoordinates (761)
http://schema.org/Organization (633)
http://schema.org/ContactPoint (458)
304.1 kB schema_SkiResort.gz (sample)
http://schema.org/SportsEvent Quads: 62,746,469
URLs: 141,006
Hosts: 383
http://schema.org/SportsTeam (3,576,216)
http://schema.org/PostalAddress (2,387,160)
http://schema.org/Place (2,370,834)
http://schema.org/SportsEvent (1,829,267)
http://schema.org/PlayAction (1,761,492)
569 MB schema_SportsEvent.gz (sample)
http://schema.org/SportsTeam Quads: 66,561,666
URLs: 198,642
Hosts: 166
http://schema.org/SportsTeam (3,739,201)
http://schema.org/Place (2,981,658)
http://schema.org/PostalAddress (2,384,753)
http://schema.org/SportsEvent (1,776,909)
http://schema.org/PlayAction (1,761,492)
596.7 MB schema_SportsTeam.gz (sample)
http://schema.org/StadiumOrArena Quads: 6,797,294
URLs: 28,638
Hosts: 48
http://schema.org/Place (742,388)
http://schema.org/Organization (91,519)
http://schema.org/Person (90,948)
http://schema.org/Review (90,848)
http://schema.org/ImageObject (85,057)
55.7 MB schema_StadiumOrArena.gz (sample)
http://schema.org/TVEpisode Quads: 8,853,972
URLs: 101,312
Hosts: 99
http://schema.org/Person (774,824)
http://schema.org/TVEpisode (295,972)
http://schema.org/Country (251,007)
http://schema.org/ImageObject (107,887)
http://schema.org/TVSeason (97,589)
143.7 MB schema_TVEpisode.gz (sample)
http://schema.org/TelevisionStation Quads: 269,718
URLs: 5,637
Hosts: 16
http://schema.org/ListItem (9,587)
http://schema.org/Organization (5,856)
http://schema.org/TelevisionStation (5,659)
http://schema.org/PostalAddress (5,655)
http://schema.org/GeoCoordinates (5,634)
2.8 MB schema_TelevisionStation.gz (sample)
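
The per-class figures in the tables above (quads, URLs, hosts, and entity counts per class) could be approximated from a downloaded subset file along the following lines. This is only a sketch under stated assumptions, not the procedure used to produce the published statistics: it assumes that an entity is counted once per rdf:type quad, that the host is the hostname of the page URL, and that every line carries a fourth element. The function names subset_stats and stats_for_file are hypothetical.

```python
import gzip
from collections import Counter
from urllib.parse import urlsplit

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def subset_stats(lines):
    """Count quads, distinct page URLs, distinct hosts and typed entities."""
    quads = 0
    urls = set()
    hosts = set()
    classes = Counter()
    for raw in lines:
        line = raw.strip()
        if not line.endswith("."):
            continue  # skip blank or malformed lines
        parts = line[:-1].split(None, 2)  # subject, predicate, remainder
        if len(parts) < 3:
            continue
        _subject, predicate, rest = parts
        tail = rest.rsplit(None, 1)  # object (may contain spaces), graph
        if len(tail) != 2:
            continue
        obj, graph = tail
        if not (graph.startswith("<") and graph.endswith(">")):
            continue
        quads += 1
        url = graph[1:-1]
        urls.add(url)
        try:
            host = urlsplit(url).hostname
        except ValueError:
            host = None  # URL could not be parsed
        if host:
            hosts.add(host)
        # Assumption: one rdf:type quad with an IRI object is one entity.
        if predicate == RDF_TYPE and obj.startswith("<") and obj.endswith(">"):
            classes[obj[1:-1]] += 1
    return {"quads": quads, "urls": len(urls), "hosts": len(hosts), "classes": classes}


def stats_for_file(path):
    """Stream a gzipped N-Quads file, e.g. a downloaded schema_Mountain.gz."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return subset_stats(handle)
```

Calling stats_for_file on a small subset and then classes.most_common(5) on the result would give a list comparable in shape to the "Top Classes" column. Because class IRIs are compared as plain strings, http://schema.org/ImageObject and https://schema.org/ImageObject are counted separately in this sketch. The sets of URLs and hosts are held in memory, so the largest subsets may need a different approach.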

If you are interested in a particular class or set of classes that is not listed above, please contact the WebDataCommons team via the mailing list or our Google Group.

Get the Code

The source code can be checked out from our Subversion repository. The November 2018 extraction was done with version 1.0.4 of the extractor. For more information about the framework and a detailed description of how to run your own extraction, visit the framework page.

Get Support

Please send questions and feedback to the Web Data Commons mailing list or post them in our Web Data Commons Google Group.