Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2014/sais_2014.htm

The Semantic Web & Ontologies

Marina Santini
[email protected]fil.uu.se
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Autumn 2014




Acknowledgements

• Slides inspired by Ian Horrocks.


Summary: QA (i)

• Google, Yahoo, Bing…
• “Traditional” Question Answering (START…):
  – http://start.csail.mit.edu/publications/FLAIRS0601KatzB.pdf (2006)
  – Other publications: http://start.csail.mit.edu/publications.php


Katz et al. (2006)
http://start.csail.mit.edu/publications/FLAIRS0601KatzB.pdf

• START answers natural language questions by presenting components of text and multi-media information drawn from a set of information resources that are hosted locally or accessed remotely through the Internet.
• START targets high precision in its question answering.
• The START system analyzes English text and produces a knowledge base which incorporates, in the form of nested ternary expressions (= triples), the information found in the text.
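As a rough illustration of what "nested ternary expressions" could look like as data, the sketch below models an expression as a (subject, relation, object) tuple whose subject or object may itself be such a tuple. The `Ternary` class, the `flatten` helper and the example sentence are assumptions made for illustration; they are not START's actual representation.

```python
from typing import NamedTuple, Union


class Ternary(NamedTuple):
    """One ternary expression: (subject, relation, object)."""
    subject: Union[str, "Ternary"]
    relation: str
    object: Union[str, "Ternary"]


def flatten(expr: Ternary) -> list:
    """Return every ternary expression contained in `expr`, innermost first."""
    found = []
    for part in (expr.subject, expr.object):
        if isinstance(part, Ternary):
            found.extend(flatten(part))
    found.append(expr)
    return found


# Hypothetical encoding of "Bill surprised Hillary with his answer":
# the inner expression is reused as the subject of the outer one.
inner = Ternary("Bill", "surprise", "Hillary")
outer = Ternary(inner, "with", "answer")

for expression in flatten(outer):
    print(expression)
```

Nesting lets one expression talk about another, which a flat list of triples cannot do without extra identifiers.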


Siri
http://en.wikipedia.org/wiki/Siri

• Siri /ˈsɪri/ is an intelligent personal assistant and knowledge navigator which works as an application for Apple Inc.'s iOS.
• The application uses a natural language user interface to answer questions, make recommendations, and perform actions by delegating requests to a set of Web services.
• The software, both in its original version and as an iOS application, adapts to the user's individual language usage and individual searches (preferences) with continuing use, and returns results that are individualized.
• The name Siri is Scandinavian, a short form of the Norse name Sigrid meaning "beauty" and "victory", and comes from the intended name for the original developer's first child.


Summary (ii)

• Siri… conversational “safety net”.
• Conversational agents (chat bots and personal assistants)
  → customer care, customer analytics (replacing/integrating FAQs and help desk)

Avatar: a picture of a person or animal that represents you on a computer screen, for example in some chat rooms or when you are playing games over the Internet


Eliza
http://en.wikipedia.org/wiki/ELIZA

ELIZA was written at MIT by Joseph Weizenbaum between 1964 and 1966.


Spoken Language

• Spoken language: incorrect syntax, incorrect morphology, spoken forms…
• syntactic mechanisms like dislocation, anaphora, and gapping;
• morphological mechanisms like specialized focus or topic-marking affixes;
• specialized discourse particles.
• Ex:
  – ‘this man, that I have not yet seen’ (left dislocation)
  – ‘It is a strange bloke, that man’ (right dislocation)
  – ‘this man, I have not yet seen him’ (hanging topic)


Jakobson's functions of language

• The Phatic Function is language for the sake of interaction.
• The Phatic Function can be observed in greetings and casual discussions of the weather, particularly with strangers.
• It also provides the keys to open, maintain, verify or close the communication channel: "Hello?", "Ok?", "Hummm", "Bye"...
• Academic interaction…


Visionary People (i)

• Roberto Busa and Thomas Watson
• QA IBM Watson → Jeopardy! (quizzes)
• Interesseklubben (difficult quizzes) → Bill Gates?, Kamprad?, etc. :-)
• Philanthropist Billionaires
  – As of 2007, Bill and Melinda Gates were the second-most generous philanthropists in America, having given over US$28 billion to charity; the couple plan to eventually donate 95 percent of their wealth to charity (why not to research too?)
  – Bill and Melinda Gates have taken the No. 1 spot on Forbes' list of the 50 top givers in America. (Oct 2014)
• (Among Gates's acquisitions is the Codex Leicester, a collection of writings by Leonardo da Vinci, which Gates bought for $30.8 million at an auction in 1994.)


Funding sources in Sweden

• Vetenskapsrådet
• Riksbanken
• Vinnova
• Crowdfunding


Outline

• The Semantic Web
• Ontologies


Chronology
http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web

• On August 6, 1991, Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup, inviting collaborators. This date also marked the debut of the Web as a publicly available service on the Internet, although new users could only access it after August 23.
• Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS, rapidly gained acceptance on the Web. This new model for information exchange, primarily featuring user-generated and user-edited websites, was dubbed Web 2.0.
• Popularized by Berners-Lee's book Weaving the Web (2000) and a Scientific American article by Berners-Lee, James Hendler, and Ora Lassila, the term Semantic Web describes an evolution of the existing Web in which the network of hyperlinked human-readable web pages is extended by machine-readable metadata about documents and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users. This has yet to happen. In 2006, Berners-Lee and colleagues stated that the idea "remains largely unrealized".

Visionary people (ii)


Web 1.0

• Web 1.0 is a retronym referring to an early stage of the World Wide Web's evolution.
• Some design elements of a Web 1.0 site include:
  – Personal web pages were common, consisting mainly of static pages
  – Static pages instead of dynamic HTML
  – The use of HTML 3.2-era elements such as frames and tables to position and align elements on a page (now we use CSS, and frames are deprecated)
  – GIF buttons...


Web 2.0

• Web 2.0 describes World Wide Web sites that use technology beyond the static pages of earlier Web sites.
• The key features of Web 2.0 include:
  – Tagging - allows users to collectively classify and find information (e.g. tagging)
  – Rich User Experience - dynamic content; responsive to user input
  – User Participation - information flows two ways between site owner and site user by means of evaluation, review, and commenting. Site users add content for others to see
  – Long tail - services offered on an on-demand basis; profit is realized through monthly service subscriptions more than one-time purchases of goods over the network
  – Software as a service - Web 2.0 sites developed APIs to allow automated usage, such as by an app or mashup
  – Mass Participation - universal web access leads to differentiation of concerns from the traditional internet userbase.


Web 3.0

• “Web 3.0, a phrase coined by John Markoff of the New York Times in 2006, refers to a supposed third generation of Internet-based services that collectively comprise what might be called ‘the intelligent Web’ – such as those using semantic web, microformats, natural language search, data-mining, machine learning, recommendation agents, and artificial intelligence technologies – which emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience.”
• Web 3.0 will be more connected, open, and intelligent, with semantic Web technologies, distributed databases, natural language processing, machine learning, machine reasoning, and autonomous agents.
  – http://lifeboat.com/ex/web.3.0


The web: present and future

• "The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.
• One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the Web.
• Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form."
  – Tim Berners-Lee, The Semantic Web Roadmap, 1998
  – http://www.w3.org/DesignIssues/Semantic.html


Today…

• The web is relatively simple:
  – Hypertexts and hypermedia
  – Access is engineered via a combination of keyword-based search and link navigation.

This simplicity has been one of the great strengths of the web, and has been an important factor in its popularity: users can even create their own content.


Shortcomings

Examples:
• Finding information about people with very common names can be a frustrating experience.
• Answering more complex queries, along with more general information retrieval, integration, sharing and processing, can be difficult or even impossible.
  – List of all the heads of state of EU countries
  – Who destroyed the Beatles?


Some solutions

• Software glue: Mashups
  – location information from one source might be combined with map information from another source in order to show the location of and provide directions to points of interest such as hotels and restaurants.
• Tagging via social networks (Web 2.0)
  – harness the power of user communities in order to share and annotate information.
  – Examples include image and video sharing sites such as Flickr and YouTube, and auction sites such as eBay.
  – In these applications, annotations usually take the form of simple tags, such as "beach", "birthday", "family" and "friends". The meaning of tags is, however, typically not well defined, and may be impenetrable even to human users: typical examples (from Flickr) include "sasquatchmusicfestival", "celebritylookalikes", and "twab08".


The “travel agent”

• The classic example of a semantic web application is an automated travel agent that, given various constraints and preferences, would offer the user suitable travel or vacation suggestions.
• A key feature of such a "software agent" is that it would not simply exploit a predetermined set of information sources, but would search the web for relevant information in much the same way that a human user might do when planning a vacation.


The goal

• The goal of the Semantic Web is to allow web information and services to be more effectively exploited by humans and automated tools.

   


Semantic Web

• The focus of the semantic web is to share data instead of documents.
• In other words, it is a project that should provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
• It is a collaborative effort led by the World Wide Web Consortium (W3C).


Semantic Web & Ontologies

• How are we going to represent meaning and knowledge on the web?
• A key idea behind the semantic web is to address this problem by giving machine-accessible semantics via annotation.
• Knowledge is represented in the form of rich conceptual schemas called ontologies.
• Ontologies are the backbone of the Semantic Web.
• Ontologies are rich conceptual schemas that give formally defined meanings to the terms used in annotations, transforming them into semantic annotations.
• They provide the knowledge that is required for semantic applications of all kinds.


Main Difficulty

• Current web content is intended for humans (HTML markup with layout, images and other presentational features).
• Humans understand this content, but machines can’t.


Basically...

• Ontologies provide a shared understanding of a domain.
• They provide background knowledge to systems to automate certain tasks.
• By the process of annotation, knowledge can be linked to ontologies.
  – Example: “Angelina Jolie” (text) linked to concept Actress
  – In our ontology we also know that an actress is always female and a person.
• Ontologies allow the creation of annotations → machine-readable and machine-understandable content.
• If machines can understand content, they can also perform more meaningful and intelligent queries.
  – Distinction of Jaguar the animal and the car.
  – Combination of information that is distributed on the Web.
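The annotation example above can be sketched in a few lines: a text span is linked to a concept, and the ontology's background knowledge supplies everything else that follows from that concept. The dictionaries, concept names and the `concepts_of` helper below are illustrative assumptions, not part of any real ontology.

```python
# Toy ontology: each concept maps to its direct superconcepts.
# The hierarchy and all names below are illustrative assumptions.
SUPERCONCEPTS = {
    "Actress": {"Female", "Person"},
    "Female": set(),
    "Person": set(),
}

# Semantic annotations: a text span linked to an ontology concept.
ANNOTATIONS = {"Angelina Jolie": "Actress"}


def concepts_of(text: str) -> set:
    """Return the annotated concept plus every concept the ontology implies."""
    seen, todo = set(), [ANNOTATIONS[text]]
    while todo:
        concept = todo.pop()
        if concept not in seen:
            seen.add(concept)
            todo.extend(SUPERCONCEPTS.get(concept, ()))
    return seen


print(sorted(concepts_of("Angelina Jolie")))  # ['Actress', 'Female', 'Person']
```

A query for "all persons" can then return the annotated text even though the annotation itself only says Actress.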


Old and New Issues

Old ones:
• Knowledge representation
• Reasoning
• Linguistics
• …

New ones:
• Integrating different ontologies may prove to be at least as hard as integrating the resources that they describe
• Creation of suitable annotations and ontologies
• …


Notwithstanding these issues…

• … considerable progress has been made in the development of the infrastructure needed to support the semantic web.
• In particular, there has been impressive progress in the development of languages and tools for content annotation and for the design and deployment of ontologies.


Semantic Annotation

• To facilitate the process of semantic annotation, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge.
• RDF and OWL are standards:
  – RDF (Resource Description Framework)
  – OWL (Web Ontology Language)


Ontologies (Metaphysics)

• Ontology, in its original philosophical sense, is a fundamental branch of metaphysics focusing on the study of existence.
• Its objective is to determine what entities and types of entities actually exist, and thus to study the structure of the world.
• The study of ontology can be traced back to the work of Plato and Aristotle, and includes the development of hierarchical categorisations of different kinds of entities and the features that distinguish them.

[Figure: Tree of Porphyry]

Tree of Porphyry, 3rd century AD

• The Porphyrian tree, Tree of Porphyry or Arbor Porphyriana is a classic device for illustrating what is also called a "scale of being". It was suggested by the 3rd century AD Greek neoplatonist philosopher and logician Porphyry.


Ontology (Computer Science, AI, LT, IR…)

• Engineering artefact, usually a model of some aspect of the world.
• It introduces vocabulary describing various aspects of the domain being modelled, and provides an explicit specification of the intended meaning of the vocabulary.
• This specification often includes classification-based information, not unlike that in Porphyry's tree.


What is an ontology (i)?

“An ontology is a formal, explicit specification of a shared conceptualization”
Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25 (1998) 161-197

“An ontology is an explicit specification of a conceptualization”
Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. Vol. 5. 1993. 199-220

• formal: machine-readable
• explicit: concepts, properties, relations, functions, constraints, axioms are explicitly defined
• shared: consensual knowledge
• conceptualization: abstract model and simplified view of some phenomenon in the world that we want to represent


What is an ontology (ii)?

• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base
  B. Swartout; R. Patil; K. Knight; T. Russ. Toward Distributed Use of Large-Scale Ontologies. Ontological Engineering. AAAI-97 Spring Symposium Series. 1997. 138-148
• An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary
  Neches, R.; Fikes, R.; Finin, T.; Gruber, T.; Patil, R.; Senator, T.; Swartout, W.R. Enabling Technology for Knowledge Sharing. AI Magazine. Winter 1991. 36-56
• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base
  A. Bernaras; I. Laresgoiti; J. Correra. Building and Reusing Ontologies for Electrical Network Applications. ECAI96. 12th European Conference on Artificial Intelligence. Ed. John Wiley & Sons, Ltd. 298-302


Examples

• Top level ontology: Standard Upper Ontology
  – In information science, an upper ontology (also known as a top-level ontology or foundation ontology) is an ontology which describes very general concepts that are the same across all knowledge domains.
• Linguistic ontology: WordNet
• General ontology: Cyc, UNSPSC, ecl@ss
• Domain ontology: MeSH (Medical Subject Headings), CHEMICALS, UMLS
• Research ontology: KA2 (Knowledge Acquisition Community Ontology)


Resource Description Framework (i)

• A language that has been developed in order to provide an extensible mechanism for describing web resources and relationships between them.
• A key feature of RDF is the use of Internationalized Resource Identifiers (IRIs), a generalisation of Uniform Resource Locators (URLs), to refer to resources.
• RDF is a very simple language: its underlying data structure is a labelled directed graph, and its only syntactic construct is the triple.
• A triple consists of three components, referred to as the subject, predicate and object.
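A minimal sketch of the triple as a data structure, assuming plain strings stand in for IRIs. The `Triple` class and the `example.org` names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """An RDF-style statement: subject --predicate--> object."""
    subject: str
    predicate: str
    object: str


# Hypothetical IRIs under example.org, used only for illustration.
EX = "http://example.org/"

statement = Triple(
    subject=EX + "Uppsala",
    predicate=EX + "locatedIn",
    object=EX + "Sweden",
)

print(statement.subject, statement.predicate, statement.object)
```

Read as a graph, the subject and object are nodes and the predicate labels the directed edge from one to the other.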

Note: a directed graph is a set of nodes connected by edges, where the edges have a direction associated with them.

IRI: /ˈaɪˌɑːˌraɪ/

RDF (ii)

• More formally, a triple represents a single edge (labelled with the predicate) connecting two nodes (labelled with the subject and object); it describes a binary relationship between the subject and object via the predicate.
• The predicate of a triple is always an IRI, and an IRI that is used in the predicate position of a triple is called a property.
• A set of triples is called an RDF graph.
• In order to facilitate the sharing and exchanging of graphs on the web, an XML serialisation has also been defined.
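The points above can be sketched with a plain Python set of tuples. For brevity the serialisation shown is N-Triples, a line-based W3C syntax, swapped in here instead of the XML serialisation the slide mentions; the `example.org` IRIs and the `to_ntriples` helper are assumptions, and the helper only handles triples made entirely of IRIs.

```python
# A graph is just a set of (subject, predicate, object) triples.
# All IRIs below are hypothetical example.org names.
EX = "http://example.org/"

graph = {
    (EX + "Uppsala", EX + "locatedIn", EX + "Sweden"),
    (EX + "Stockholm", EX + "locatedIn", EX + "Sweden"),
    (EX + "Stockholm", EX + "capitalOf", EX + "Sweden"),
}
# Adding a statement that is already present changes nothing.
graph.add((EX + "Uppsala", EX + "locatedIn", EX + "Sweden"))

# Nodes are the subjects and objects; properties are the IRIs used as predicates.
nodes = {s for s, _, _ in graph} | {o for _, _, o in graph}
properties = {p for _, p, _ in graph}


def to_ntriples(triples) -> str:
    """Serialise IRI-only triples, one '<s> <p> <o> .' line per edge."""
    lines = [f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples)]
    return "\n".join(lines)


print(len(graph), "triples,", len(nodes), "nodes,", len(properties), "properties")
print(to_ntriples(graph))
```

Using a set mirrors the definition directly: order does not matter and a repeated triple is stored once.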


“Harry Potter has a pet called Hedwig…”

[Figure: the statement shown as RDF/XML and as an RDF graph]
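As a rough sketch of how such a statement might be written in RDF/XML, the code below builds the XML with Python's standard `xml.etree.ElementTree`. The `example.org` namespace and the property name `hasPet` are assumptions for illustration; the IRIs used on the original slide are not known.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/"  # hypothetical namespace, not from the slide

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)

# Triples for the sentence; the property name hasPet is an assumption.
triples = [(EX_NS + "HarryPotter", "hasPet", EX_NS + "Hedwig")]

root = ET.Element(f"{{{RDF_NS}}}RDF")
for subject, prop, obj in triples:
    # One rdf:Description element per triple, naming the subject node ...
    node = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": subject})
    # ... with a child element for the edge, pointing at the object node.
    ET.SubElement(node, f"{{{EX_NS}}}{prop}",
                  {f"{{{RDF_NS}}}resource": obj})

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

The same information is a single edge in the graph view: a node for Harry Potter, a node for Hedwig, and an edge labelled with the property between them.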


Lect 07: Relation Extraction: Relation databases that draw from Wikipedia

• Resource Description Framework (RDF) triples

  subject                    predicate              object
  Golden Gate Park           location               San Francisco
  dbpedia:Golden_Gate_Park   dbpedia-owl:location   dbpedia:San_Francisco

• DBpedia: 1 billion RDF triples, 385 million from English Wikipedia
• Frequent Freebase relations:
  – people/person/nationality
  – location/location/contains
  – people/person/profession
  – people/person/place-of-birth
  – biology/organism_higher_classification
  – film/film/genre
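Prefixed names such as `dbpedia:Golden_Gate_Park` are shorthand for full IRIs. The sketch below expands them with a prefix table; the two namespace IRIs are the ones commonly published for DBpedia, but treat them as assumptions and verify them before relying on them. The `expand` helper is hypothetical.

```python
# Prefix table; the namespace IRIs are assumptions to be verified.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}


def expand(name: str) -> str:
    """Turn 'prefix:local' into a full IRI using PREFIXES."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


triple = ("dbpedia:Golden_Gate_Park", "dbpedia-owl:location", "dbpedia:San_Francisco")
for term in triple:
    print(expand(term))
```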


Lect 08: Relation Extraction

• Answers: Databases of Relations
  – born-in(“Emma Goldman”, “June 27 1869”)
  – author-of(“Cao Xue Qin”, “Dream of the Red Chamber”)
  – Draw from Wikipedia infoboxes, DBpedia, Freebase, etc.
• Questions: Extracting Relations in Questions
  Whose granddaughter starred in E.T.?
    (acted-in ?x “E.T.”)
    (granddaughter-of ?x ?y)
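To show how the two query patterns could be answered from a database of relations, here is a small sketch of conjunctive pattern matching with variables. The `FACTS` entries are illustrative data entered by hand, and `match` and `solve` are hypothetical helpers, not the method of any particular QA system.

```python
# Toy relation database; the entries are illustrative, hand-entered facts.
FACTS = {
    ("acted-in", "Drew Barrymore", "E.T."),
    ("acted-in", "Henry Thomas", "E.T."),
    ("granddaughter-of", "Drew Barrymore", "John Barrymore"),
}


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(pattern, fact, bindings):
    """Try to extend `bindings` so that `pattern` equals `fact`."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None  # variable already bound to something else
        elif p != f:
            return None  # constant mismatch
    return new


def solve(patterns, bindings=None):
    """Yield every variable binding that satisfies all patterns."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in FACTS:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, extended)


# "Whose granddaughter starred in E.T.?"
query = [("acted-in", "?x", "E.T."), ("granddaughter-of", "?x", "?y")]
answers = [b["?y"] for b in solve(query)]
print(answers)  # ['John Barrymore']
```

The shared variable `?x` is what joins the two relations: only an actor who also appears in a granddaughter-of fact produces an answer.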


RDF Schema (RDF Vocabulary Description Language)

• Extends RDF by giving additional meaning to special resources…
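Two of the resources that RDF Schema gives meaning to are `rdfs:domain` and `rdfs:subClassOf`. The sketch below applies the two corresponding entailment patterns to a tiny graph until nothing new is derived. Prefixed names are kept as plain strings, every `ex:` name is hypothetical, and only these two patterns are covered, not the full set of RDFS rules.

```python
# Prefixed names are used as plain strings; ex: names are hypothetical.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"

graph = {
    ("ex:hasPet", DOMAIN, "ex:PetOwner"),
    ("ex:PetOwner", SUBCLASS, "ex:Person"),
    ("ex:HarryPotter", "ex:hasPet", "ex:Hedwig"),
}


def rdfs_closure(triples):
    """Apply two RDFS entailment patterns until nothing new is derived."""
    closed = set(triples)
    while True:
        derived = set()
        for s, p, o in closed:
            # (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
            for p2, rel, c in closed:
                if rel == DOMAIN and p2 == p:
                    derived.add((s, TYPE, c))
            # (C rdfs:subClassOf D) and (s rdf:type C)  =>  (s rdf:type D)
            if p == TYPE:
                for c, rel, d in closed:
                    if rel == SUBCLASS and c == o:
                        derived.add((s, TYPE, d))
        if derived <= closed:
            return closed
        closed |= derived


inferred = rdfs_closure(graph) - graph
print(sorted(inferred))
```

Nothing in the input says that Harry Potter is a person; that statement follows only from the meaning RDFS assigns to the two special properties.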


… but still not enough…

• Capabilities of RDF as an ontology language are limited
  – No cardinality
  – Not possible to describe conjunction of classes
  – …

RDF is a very simple language


The cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3.


Need for a more expressive ontology language: OWL (Web Ontology Language)

• Since the architecture of the web depends on agreed standards, the World Wide Web Consortium (W3C) set up a standardisation working group to develop a standard for a web ontology language.

• The result of this activity was the OWL ontology language standard.

• The integration of OWL with RDF has the advantage of making OWL ontologies directly accessible to web-based applications.


Back Story: http://ileriseviye.wordpress.com/2011/11/01/why-web-ontology-language-is-abbreviated-as-owl-and-not-wol/


Description Logics (DLs)

• A key feature of OWL is its basis in Description Logics, a family of logic-based knowledge representation formalisms that have a formal semantics based on first-order logic (FOL).


Lect 02: Description Logics
• DLs refer to a family of logical approaches that correspond to different subsets of FOL.

• We can use DLs to model an application domain. The focus is then on:
  – Representation of knowledge about categories
  – The set of categories in an application domain is called a terminology
  – The terminology is arranged in a hierarchical organization called an ontology, which captures superset and subset relations among categories/concepts.
  – To specify a hierarchical structure, we can use subsumption relations between the appropriate concepts in a terminology
  – Subsumption is a form of inference: it determines whether a superset/subset relation (based on the facts asserted in a terminology) exists between two concepts.
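Subsumption over an explicitly asserted hierarchy can be sketched as a reachability check. The example below is a simplification under stated assumptions: the `TERMINOLOGY` dictionary and its concept names are hypothetical, and only directly asserted parent links are followed, whereas a real DL reasoner also derives subsumptions from concept definitions.

```python
# Hypothetical terminology: each concept maps to its directly asserted parents.
TERMINOLOGY = {
    "Owl": {"Bird"},
    "Bird": {"Animal"},
    "Wizard": {"Person"},
    "Animal": set(),
    "Person": set(),
}


def subsumes(general, specific, terminology=TERMINOLOGY):
    """Return True if `general` is a superset concept of `specific`."""
    seen, stack = set(), [specific]
    while stack:
        concept = stack.pop()
        if concept == general:
            return True
        if concept in seen:
            continue  # already explored; guards against cyclic assertions
        seen.add(concept)
        stack.extend(terminology.get(concept, ()))
    return False


print(subsumes("Animal", "Owl"))   # follows Owl -> Bird -> Animal
print(subsumes("Person", "Owl"))   # no chain of parents links Owl to Person
```

The superset relation between Animal and Owl is never stated directly; it is inferred from the two asserted links, which is the sense in which subsumption is a form of inference.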


Lect 02: OWL and the Semantic Web

• The Web Ontology Language (OWL) is based, roughly, on a Description Logic.

• OWL is a language for developing the ontologies that are meant to encapsulate knowledge for the Semantic Web.

• The Semantic Web is the effort to formally specify the semantics of the contents of the web.


DLs
• These formalisms adopt an object-oriented model, in which the domain is described in terms of individuals (instances), concepts (classes), and roles (properties/predicates):
  – individuals, e.g., "Hedwig", are the basic elements of the domain;
  – concepts, e.g., "Owl", describe sets of individuals having similar characteristics;
  – roles, e.g., "hasPet", describe relationships between pairs of individuals, such as "HarryPotter hasPet Hedwig".


Axioms
• An OWL ontology consists of a set of axioms

• Example:
  – given the axiom C equivalentClass D, an individual is an instance of C if and only if it is an instance of D.
  – i.e. combining axioms with class descriptions allows for easy extension of the vocabulary by introducing new names as abbreviations for descriptions.

See the following axiom:

    Class: HogwartsStudent
        EquivalentTo: Student and attendsSchool value Hogwarts

It introduces the class name HogwartsStudent, and asserts that its instances are just those Students who attend Hogwarts.
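The HogwartsStudent axiom can be read set-theoretically: the class is the intersection of Student with the set of individuals related to Hogwarts by attendsSchool. The sketch below assumes a small hypothetical set of assertions (the individuals and the `Smeltings` value are invented for illustration) and evaluates the description over known facts only, which is a closed-world simplification of what an OWL reasoner does.

```python
# Hypothetical assertions: class membership and attendsSchool role pairs.
students = {"Harry", "Hermione", "Dudley"}
attends_school = {
    ("Harry", "Hogwarts"),
    ("Hermione", "Hogwarts"),
    ("Dudley", "Smeltings"),
    ("Hagrid", "Hogwarts"),  # related to Hogwarts, but not asserted to be a Student
}


def hogwarts_students(students, attends_school):
    """Instances of the description: Student and attendsSchool value Hogwarts."""
    attends_hogwarts = {who for (who, school) in attends_school
                        if school == "Hogwarts"}
    # 'and' in the class description corresponds to set intersection.
    return students & attends_hogwarts


print(sorted(hogwarts_students(students, attends_school)))
```

Because the axiom is an equivalence, the reading works in both directions: every member of the intersection is a HogwartsStudent, and every HogwartsStudent is both a Student and attends Hogwarts.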


TBox & ABox

• Axioms describe constraints on the structure of the domain:
  – in DLs such a set of axioms is called a TBox (Terminology Box).

• OWL also allows for axioms asserting facts about some concrete situation, similar to data in a database setting:
  – in DLs such a set of axioms is called an ABox (Assertion Box).


Decidability (i)

• Description Logics are fully-fledged logics and so have a formal semantics.

• DLs can be seen as decidable subsets of FOL with:
  – individuals being equivalent to constants,
  – concepts to unary predicates,
  – roles to binary predicates.


Lect 02: But… undecidable (sometimes)

• Church and Turing showed in 1936 that first-order logic is in general undecidable. (Gödel's Incompleteness Theorem of 1931 is a related but distinct result about formal arithmetic.)

• That means there is no algorithm that can decide, for every statement of the logic, whether or not it is valid.

• Ex: the Halting Problem cannot be solved either


Lect 02: Halting Problem
• In 1936 Alan Turing proved that it is not possible to decide whether an arbitrary program will eventually halt or run forever.

• The official definition of the problem is to write a program (actually, a Turing Machine*) that accepts as parameters a program and its parameters. That program needs to decide, in finite time, whether the given program will ever halt when run on those parameters.

• The halting problem is a cornerstone problem in computer science. It is used mainly as a way to prove that a given task is impossible, by showing that solving that task would allow one to solve the halting problem.

*A Turing machine is a hypothetical device that manipulates symbols according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm.
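The core of Turing's argument can be modelled in a few lines. The sketch below is only an illustration of the contradiction, not a proof: it assumes a hypothetical "paradox" program that asks a claimed halting decider about itself and then does the opposite, and it represents the program's behaviour as a string rather than actually looping. All names are invented for this example.

```python
def paradox_outcome(decider_says_halts: bool) -> str:
    """Behaviour of the paradox program, given the decider's verdict about it.

    The paradox program loops forever if the decider says it halts,
    and stops immediately if the decider says it loops forever.
    """
    return "loops forever" if decider_says_halts else "halts"


def decider_is_correct(decider_says_halts: bool) -> bool:
    """Compare the decider's prediction with what the paradox program does."""
    predicted = "halts" if decider_says_halts else "loops forever"
    return predicted == paradox_outcome(decider_says_halts)


# Whichever verdict the decider gives, it is wrong about the paradox program.
for verdict in (True, False):
    print(verdict, decider_is_correct(verdict))
```

Since both possible verdicts are wrong, no decider can be right about every program, which is the conclusion the slide states.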


Decidability (ii)

• DLs give a precise and unambiguous meaning to descriptions of the domain

• This also allows for the development of reasoning algorithms that can provide correct answers to arbitrarily complex queries about the domain.


Reasoning: OWL vs Databases

• Ex: OWL axioms behave like inference rules rather than database constraints.

    Class: Phoenix
        SubClassOf: isPetOf only Wizard

    Individual: Fawkes
        Types: Phoenix
        Facts: isPetOf Dumbledore

• Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix.

• In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer.

• In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore being already known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
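The contrast between the two readings of the Phoenix statement can be sketched as two update functions over the same data. This is a minimal illustration, assuming sets of names stand in for the Wizard and Phoenix classes; the function names and the `ConstraintViolation` exception are hypothetical and do not correspond to any OWL or database API.

```python
class ConstraintViolation(Exception):
    """Raised when an update would leave the data in an invalid state."""


def owl_style_update(wizards, phoenixes, pet, owner):
    """Read 'Phoenix SubClassOf isPetOf only Wizard' as an inference rule."""
    if pet in phoenixes:
        # The owner of a Phoenix must be a Wizard, so infer that fact.
        wizards = wizards | {owner}
    return wizards


def database_style_update(wizards, phoenixes, pet, owner):
    """Read the same statement as an integrity constraint on the data."""
    if pet in phoenixes and owner not in wizards:
        raise ConstraintViolation(f"{owner} is not known to be a Wizard")
    return wizards


# Adding 'Fawkes isPetOf Dumbledore' when no Wizard is known yet:
print(owl_style_update(set(), {"Fawkes"}, "Fawkes", "Dumbledore"))
try:
    database_style_update(set(), {"Fawkes"}, "Fawkes", "Dumbledore")
except ConstraintViolation as error:
    print("rejected:", error)
```

The same input yields a new fact in the first case and a rejected update in the second, which mirrors the difference described in the bullets above.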


Ontology Development Tools

• State-of-the-art ontology development tools, such as SWOOP, Protégé, and TopBraid Composer, use DL reasoners to provide feedback to the user about the logical implications of their design:
  – e.g. warnings about inconsistencies and synonyms.


WebProtégé: http://protege.stanford.edu/products.php#web-protege


VOWL: Visual Notation for OWL Ontologies
http://vowl.visualdataweb.org/v2/


WebVOWL: http://vowl.visualdataweb.org

• A web-based implementation of VOWL.

• There is also a VOWL plugin for the ontology editor Protégé that implements the VOWL specifications.


Domain-specific ontologies
• The availability of tools has contributed to the increasingly widespread use of OWL, and it has become the de facto standard for ontology development in fields as diverse as
  – Biology
  – Medicine
  – Geography
  – Geology
  – Agriculture
  – Defence
  – etc.


Complex Queries
• The use of DL reasoners allows OWL ontology applications to answer complex queries and to provide guarantees about the correctness of the result.

• Reliability and correctness are clearly important features of any information system; they are particularly important if ontology-based systems are to be used in safety-critical applications such as medicine, where incorrect reasoning could adversely impact patient care.


Standard Query Language

• It has long been recognised that the semantic web, and semantic web knowledge representation languages such as RDF and OWL, would also benefit from the availability of a standardised query language, as relational databases have in SQL.

• A W3C standardisation working group was set up, and has completed its work on the SPARQL query language standard.


SPARQL Protocol and RDF Query Language …

• … is an RDF query language, i.e. a query language for databases, able to retrieve and manipulate data stored in RDF format

• SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns
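How triple patterns, conjunctions, and optional patterns combine can be sketched in plain Python. This is not SPARQL and not a real RDF library: `match`, `query`, the `ex:label` predicate, and the sample triples are assumptions made for illustration, and terms starting with "?" are treated as variables.

```python
# Sample data for the sketch; the ex:label triple is invented for illustration.
TRIPLES = [
    ("dbpedia:Golden_Gate_Park", "dbpedia-owl:location", "dbpedia:San_Francisco"),
    ("dbpedia:Alcatraz_Island", "dbpedia-owl:location", "dbpedia:San_Francisco"),
    ("dbpedia:Golden_Gate_Park", "ex:label", "Golden Gate Park"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was first bound to.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def query(triples, required, optional=()):
    """Conjunction of `required` patterns, then a left join per `optional` pattern."""
    solutions = [{}]
    for pattern in required:
        solutions = [b2 for b in solutions for t in triples
                     if (b2 := match(pattern, t, b)) is not None]
    for pattern in optional:
        extended = []
        for b in solutions:
            hits = [b2 for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
            extended.extend(hits or [b])  # keep the solution even without a hit
        solutions = extended
    return solutions


results = query(
    TRIPLES,
    required=[("?place", "dbpedia-owl:location", "dbpedia:San_Francisco")],
    optional=[("?place", "ex:label", "?label")],
)
for row in results:
    print(row)
```

Both places are returned, but only the one with a label carries a `?label` binding, which is the behaviour an optional pattern is meant to give. A disjunction could be approximated in this sketch by concatenating the results of two `query` calls.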


Tags & Ontologies

• Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of hierarchically organised sets of tags, often called folksonomies…


Challenges

• Currently hard to combine:
  – Increased expressive power (by using more sophisticated logics) with scalability (large ontologies)


Ontology Learning
• Ontology learning (ontology extraction, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between those concepts from a corpus of natural language text, and encoding them with an ontology language for easy retrieval.

• As building ontologies manually is extremely labor-intensive and time-consuming, there is great motivation to automate the process.

• Typically, the process starts by extracting terms and concepts or noun phrases from plain text using linguistic processors such as part-of-speech tagging and phrase chunking. Then statistical techniques are used to extract relations, often based on pattern-based or definition-based hypernym extraction techniques.
  – http://en.wikipedia.org/wiki/Ontology_learning
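Pattern-based hypernym extraction can be illustrated with a single lexical pattern of the form "X such as A, B and C". The sketch below is deliberately simplified and rests on stated assumptions: single words stand in for noun phrases (no tagging or chunking is performed), only one pattern is used, and the function and variable names are invented for this example.

```python
import re

# One lexical pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Single words stand in for noun phrases to keep the sketch short.
SUCH_AS = re.compile(r"(\w+) such as ((?:\w+(?:, | and | or )?)+)")


def extract_hypernym_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for found in SUCH_AS.finditer(text):
        hypernym = found.group(1)
        # Split the enumerated examples on the same separators the pattern allows.
        for hyponym in re.split(r", | and | or ", found.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


print(extract_hypernym_pairs("She studies birds such as owls, eagles and falcons."))
```

Each extracted pair is a candidate subset/superset relation for the ontology; a fuller system would combine several such patterns with the statistical filtering mentioned above.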


Ontology Mining

• At the intersection of computational linguistics and the semantic web, there is a community on ontology learning/mining

• Paul Buitelaar and Georgeta Bordea in Ireland:
  – http://nlp.insight-centre.org/


So, how are you, Semantic Web?


Dead or Alive?

• Yahoo Researcher Declares Semantic Web Dead (2007)
  – http://searchenginewatch.com/sew/news/2056255/yahoo-researcher-declares-semantic-web-dead

• Three reasons why the Semantic Web has failed (2013)
  – https://gigaom.com/2013/11/03/three-reasons-why-the-semantic-web-has-failed/

• Semantic Web Is Dead, Long Live Structured Web! (2014)
  – http://www.techweekeurope.co.uk/workspace/import-io-structured-web-141768

• etc.


W3C (World Wide Web Consortium): Semantic Web Official web page (2014)
• http://www.w3.org/standards/semanticweb/


Further Reading: Morgan & Claypool Series on The Semantic Web


Present and Future


The end
