rdf - DBPedia local server giving strange results for different queries -




I am trying to get a list of Wikipedia people with as many features as possible for a machine learning problem.

I have set up a local DBpedia server and increased the limits for various parameters, but somehow I am still unable to get the desired results.

The desired output is a CSV in the following format:

<person1>,<feature1>,<feature2>,<feature3>,...
<person2>,<feature1>,<feature2>,<feature3>,...
<person3>,<feature1>,<feature2>,<feature3>,...
...and so on

Can anyone direct me towards the right way?

For example, when I run the following query, I get a blank result:

query:

select ?name ?birthdate
{
  {
    select strafter(str(?person),"http://dbpedia.org/resource/") as ?name, str(?birthdate) as ?birthdate
    {
      ?person a <http://dbpedia.org/ontology/Person> .
      ?person dbpedia-owl:birthDate ?birthdate .
    }
    order by asc(?name)
  }
}
offset 100000
limit 500

result: [[name]][[birthdate]]

But when I run this query, I get exactly 50000 rows, no more and no less:

query:

select strafter(str(?person),"http://dbpedia.org/resource/") as ?name, str(?birthdate) as ?birthdate, str(?birthname) as ?birthname, strafter(str(?occupation),"http://dbpedia.org/resource/") as ?occupation
{
  ?person a <http://dbpedia.org/ontology/Person> .
  ?person dbpedia-owl:birthDate ?birthdate .
  ?person dbpedia-owl:birthName ?birthname .
  ?person dbpedia-owl:occupation ?occupation .
}

result: <<50000 rows>>

Strangely, this query seems to work (at least up to some number of rows):

query:

select ?s ?p ?o { ?s a dbpedia-owl:Person ; ?p ?o }

result: << 1051038 rows >>

My virtuoso.ini file:

[database]
databasefile = /var/lib/virtuoso/db/virtuoso.db
errorlogfile = /var/lib/virtuoso/db/virtuoso.log
lockfile = /var/lib/virtuoso/db/virtuoso.lck
transactionfile = /var/lib/virtuoso/db/virtuoso.trx
xa_persistent_file = /var/lib/virtuoso/db/virtuoso.pxa
errorloglevel = 7
fileextend = 200
;maxcheckpointremap = 2000
maxcheckpointremap = 1362500
striping = 0
tempstorage = tempdatabase

[tempdatabase]
databasefile = /var/lib/virtuoso/db/virtuoso-temp.db
transactionfile = /var/lib/virtuoso/db/virtuoso-temp.trx
maxcheckpointremap = 2000
striping = 0

[parameters]
serverport = 1111
litemode = 0
disableunixsocket = 1
disabletcpsocket = 0
;sslserverport = 2111
;sslcertificate = cert.pem
;sslprivatekey = pk.pem
;x509clientverify = 0
;x509clientverifydepth = 0
;x509clientverifycafile = ca.pem
serverthreads = 20
checkpointinterval = 60
o_direct = 0
casemode = 2
maxstaticcursorrows = 500000000
checkpointaudittrail = 0
allowoscalls = 0
schedulerinterval = 10
dirsallowed = ., /usr/share/virtuoso/vad, /usr/local/data/datasets
threadcleanupinterval = 0
threadthreshold = 10
resourcescleanupinterval = 0
freetextbatchsize = 100000
singlecpu = 0
vadinstalldir = /usr/share/virtuoso/vad/
prefixresultnames = 0
rdffreetextrulessize = 100
indextreemaps = 256
maxmempoolsize = 200000000
prefixresultnames = 0
macspotlight = 0
indextreemaps = 64
maxsortedtoprows = 100000000
;;
;; uncomment next two lines if there is 64 GB system memory free
numberofbuffers = 5450000
maxdirtybuffers = 4000000
;;

[httpserver]
serverport = 8890
serverroot = /var/lib/virtuoso/vsp
serverthreads = 20
davroot = dav
enableddavvsp = 0
httpproxyenabled = 0
tempaspxdir = 0
defaultmailserver = localhost:25
serverthreads = 10
maxkeepalives = 10
keepalivetimeout = 10
maxcachedproxyconnections = 10
proxyconnectioncachetimeout = 15
httpthreadsize = 280000
httpprintwarningsinoutput = 0
charset = utf-8
;httplogfile = logs/http.log

[autorepair]
badparentlinks = 0

[client]
sql_prefetch_rows = 100
sql_prefetch_bytes = 16000
sql_query_timeout = 0
sql_txn_timeout = 0
;sql_no_char_c_escape = 1
;sql_utf8_execs = 0
;sql_no_system_tables = 0
;sql_binary_timestamp = 1
;sql_encryption_on_password = -1

[vdb]
arrayoptimization = 0
numarrayparameters = 10
vdbdisconnecttimeout = 1000
keepconnectiononfixedthread = 0

[replication]
servername = db-ip-172-31-24-242
serverenable = 1
queuemax = 5000000

[striping]
segment1 = 100m, db-seg1-1.db, db-seg1-2.db
segment2 = 100m, db-seg2-1.db
;...

[zero config]
servername = virtuoso (ip-172-31-24-242)

[uriqa]
dynamiclocal = 0
defaulthost = localhost:8890

[sparql]
;externalquerysource = 1
;externalxsltsource = 1
;defaultgraph = http://localhost:8890/dataspace
;immutablegraphs = http://localhost:8890/dataspace
;resultsetmaxrows = 10000
resultsetmaxrows = 1000000000
;maxquerycostestimationtime = 400 ; in seconds
maxquerycostestimationtime = 4000000000000000 ; in seconds
;maxqueryexecutiontime = 60 ; in seconds
maxqueryexecutiontime = 600000000000000 ; in seconds
defaultquery = select distinct ?concept {[] ?concept} limit 100
deferinferencerulesinit = 0 ; controls inference rules loading
;pingservice = http://rpc.pingthesemanticweb.com/
maxsortedtoprows = 10000000

[plugins]
loadpath = /usr/lib/virtuoso/hosting
load1 = plain, wikiv
load2 = plain, mediawiki
load3 = plain, creolewiki
load4 = plain, im

Please tell me in case I am missing something trivial; the results of these queries do not make sense to me.

It's hard to isolate the exact problem since you are doing a number of different queries. If you want to isolate the cause, the best thing to do is to make small changes.

Also: your queries are syntactically illegal SPARQL, which makes it hard to judge what is going wrong. In particular, the way you formulate 'as' aliases is wrong: for one thing, an alias should be enclosed in parentheses, and secondly, you should not alias to an existing variable. For example, rather than something like:

str(?birthdate) as ?birthdate

you should be doing something like:

(str(?birthdate) as ?bd)
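Applied to the first query, that fix might look like the sketch below. The query is held in a Python string so it can be sent to an endpoint later. The helper name build_person_query, the fresh variable ?bd, the explicit PREFIX line and the ORDER BY ?person clause are assumptions for illustration, not something taken from the question or verified against the local dataset.

```python
# Hypothetical helper: builds a corrected version of the first query.
# The prefix IRI is the usual DBpedia ontology namespace (assumed here).
PREFIXES = "PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>\n"


def build_person_query(limit=500, offset=0):
    """Return a SPARQL query whose aliases are parenthesized and bound to fresh variables."""
    return PREFIXES + (
        'SELECT (STRAFTER(STR(?person), "http://dbpedia.org/resource/") AS ?name)\n'
        "       (STR(?birthdate) AS ?bd)\n"
        "WHERE {\n"
        "  ?person a dbpedia-owl:Person ;\n"
        "          dbpedia-owl:birthDate ?birthdate .\n"
        "}\n"
        # A stable ordering keeps LIMIT/OFFSET pages from overlapping.
        "ORDER BY ?person\n"
        f"LIMIT {int(limit)} OFFSET {int(offset)}"
    )


query = build_person_query(limit=500, offset=0)
print(query)
```

Note that ?birthdate stays bound to the original value inside the pattern, while the string form goes into the new variable ?bd.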

Apart from that, in your first query you are setting an offset value of 100000. Presumably, you are not getting any answers because there are fewer than 100000 results.

In your second query, you are getting 50000 results, which presumably accurately reflects the actual number of people matching your criteria. Again, though, the query is odd in that it tries to "re-bind" variables to a new value using the 'as' aliasing command.

Finally, your last query retrieves all triples for resources of type Person. It is not surprising that this result is far larger, as you are imposing no further constraints. Each row in the result is one property-value combination for a particular person.

I recommend you have a look at a basic SPARQL tutorial, because I think you may be missing some of the basics. SPARQL takes a bit of getting used to, but once you learn the basics (like what graph pattern matching means), you should find it far easier to write your own queries.
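One way to test that explanation is to page through the results from offset 0 and see where they run out. The following is a minimal sketch, assuming a hypothetical fetch_page(offset, limit) callable that runs the query and returns a list of rows. Here it is backed by a made-up in-memory list so the paging logic can be run without a server.

```python
def iter_pages(fetch_page, page_size=500):
    """Yield rows page by page, stopping at the first empty page."""
    offset = 0
    while True:
        rows = fetch_page(offset, page_size)
        if not rows:
            break  # the offset is past the end of the result set
        yield from rows
        offset += page_size


# Stand-in for a real endpoint: 1200 made-up rows.
FAKE_ROWS = [f"person{i}" for i in range(1200)]


def fake_fetch(offset, limit):
    """Mimic LIMIT/OFFSET semantics on the in-memory list."""
    return FAKE_ROWS[offset:offset + limit]


all_rows = list(iter_pages(fake_fetch))
past_end = fake_fetch(100000, 500)  # like OFFSET 100000 on a smaller result set
print(len(all_rows), past_end)
```

An offset beyond the last row is not an error; it simply yields an empty page, which matches the blank result described in the question.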
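To get from such property-value rows to the one-line-per-person CSV described in the question, the rows can be grouped by subject on the client side. This is a minimal sketch using only the Python standard library. The function name pivot_rows, the sample rows, the chosen feature columns and the rule of joining multiple values with "|" are all assumptions for illustration.

```python
import csv
import io
from collections import defaultdict


def pivot_rows(rows, features):
    """Turn (person, property, value) rows into CSV text with one line per person."""
    grouped = defaultdict(lambda: defaultdict(list))
    for person, prop, value in rows:
        if prop in features:  # ignore properties that are not wanted as features
            grouped[person][prop].append(value)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["person"] + list(features))
    for person in sorted(grouped):
        # Multi-valued properties are joined with "|"; missing ones become empty cells.
        writer.writerow([person] + ["|".join(grouped[person][f]) for f in features])
    return out.getvalue()


# Sample rows for illustration only, shaped like the ?s ?p ?o result.
sample = [
    ("Ada_Lovelace", "birthDate", "1815-12-10"),
    ("Ada_Lovelace", "occupation", "Mathematician"),
    ("Ada_Lovelace", "occupation", "Writer"),
    ("Alan_Turing", "birthDate", "1912-06-23"),
    ("Alan_Turing", "label", "Alan Turing"),
]
csv_text = pivot_rows(sample, ["birthDate", "occupation"])
print(csv_text)
```

Pivoting this way avoids requiring every feature in a single graph pattern, which would silently drop any person who lacks one of the properties.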

rdf sparql semantic-web dbpedia virtuoso
