MARCXML to Topic Maps

 
Author: akivela (Site Admin; Helsinki, Finland)
Posted: Fri Jul 30, 2010 2:47 pm

The open data movement is taking steps forward here in Finland. Some time ago the Helsinki city library [1] released all of its library data as a huge MARCXML dump [2] under the rather liberal Creative Commons Attribution-ShareAlike 1.0 license. I thought it would be nice to have the MARCXML dump converted to a Topic Maps format, and asked the Topic Maps mailing list whether anybody had made such conversions [3].

It was a bit of a surprise to me that the question raised such a lively debate, resulting in more than twenty submissions to the mailing list in a very short time. Thanks to all who shared their experiences. Another surprising observation was that there really was no MARCXML to Topic Maps transformation available, except [4]. Thank you, Maria.

After some investigation I decided to write the MARCXML to Topic Maps conversion from scratch. Reading people's opinions and experiences about MARC, it became clear that one shouldn't attempt a semantic transformation but rather a wrapper transformation. In other words, the conversion tries to model the MARCXML schema itself using Topic Maps. It took a couple of days to make a first draft of the conversion feature [6], then a few more days to fix bugs and add some new features such as batch conversion [7]. The MARCXML to Topic Maps conversion feature will be part of the next Wandora release, to be published in late August 2010.
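To illustrate what a wrapper transformation means here, below is a minimal Python sketch. It is not Wandora's actual converter: the function name `wrap_marcxml`, the `record:`/`field:`/`value:` identifier prefixes, and the sample record are all made up for this example. It only mirrors the MARCXML structure (record, field tag, subfield code, value) as topics and associations, without interpreting what any MARC field means.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC 21 slim schema).
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Made-up sample record, for illustration only.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">12345</controlfield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Example Author</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example Title</subfield>
    </datafield>
  </record>
</collection>"""


def wrap_marcxml(xml_text):
    """Mirror MARCXML structure as topics and associations.

    Hypothetical model: every record, every field/subfield type and
    every subfield value becomes a topic; each subfield occurrence
    becomes one association (type, record, value).
    """
    topics = set()
    associations = []
    root = ET.fromstring(xml_text)
    for index, record in enumerate(root.iter(MARC_NS + "record")):
        # Use control field 001 as the record identifier when present.
        record_id = f"record-{index}"
        for control in record.findall(MARC_NS + "controlfield"):
            if control.get("tag") == "001" and control.text:
                record_id = control.text.strip()
        record_topic = "record:" + record_id
        topics.add(record_topic)
        for field in record.findall(MARC_NS + "datafield"):
            tag = field.get("tag")
            for sub in field.findall(MARC_NS + "subfield"):
                value = (sub.text or "").strip()
                if not value:
                    continue  # skip empty subfields
                type_topic = f"field:{tag}{sub.get('code')}"
                value_topic = "value:" + value
                topics.update((type_topic, value_topic))
                associations.append((type_topic, record_topic, value_topic))
    return topics, associations


if __name__ == "__main__":
    topics, associations = wrap_marcxml(SAMPLE)
    print(len(topics), "topics,", len(associations), "associations")
```

Because identical values map to the same topic identifier, two records sharing an author or keyword end up connected through that value topic, which is what makes the result browsable even without a semantic mapping.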

Back to the MARCXML dumps of the Helsinki city library. The data has been divided into 69 MARCXML dumps, each containing 10000 records. Having converted some of these dumps, it looks like 10000 records explode into ~100000 topics and ~200000 associations. It is already clear that one just can't make a single in-memory topic map out of all that data. Nearly 7 million topics and 14 million associations are just too much for Wandora. One option could be a database topic map. Another option, perhaps the best one, is to divide the topic map into 69 sub-topic maps as well, and leave the merging to the user.
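The totals above are simple multiplication, which can be written out as a small Python sketch. The per-dump figures are the rough estimates from the converted samples; the constant and function names are made up for this example. Treat the result as an upper bound under the assumption that dumps share topics: when sub-topic maps are merged, topics that recur across dumps (keywords, authors and so on) would collapse into one.

```python
# Rough per-dump figures taken from the converted sample dumps.
DUMPS = 69
RECORDS_PER_DUMP = 10_000
TOPICS_PER_DUMP = 100_000        # approximate
ASSOCIATIONS_PER_DUMP = 200_000  # approximate


def estimate_totals(dumps=DUMPS):
    """Scale the per-dump estimates to the given number of dumps."""
    return {
        "records": dumps * RECORDS_PER_DUMP,
        "topics": dumps * TOPICS_PER_DUMP,
        "associations": dumps * ASSOCIATIONS_PER_DUMP,
    }


if __name__ == "__main__":
    for name, count in estimate_totals().items():
        print(f"{name}: ~{count:,}")
```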

Now, what can you do with the Topic Maps conversion of the Helsinki city library data? Well, I found it very interesting to browse all that data using Wandora. The associative data model of Topic Maps makes it very enjoyable to float around the data using keywords, person names, organizations, etc. as pathways to the next room. When the data is a topic map, users can also easily merge their own data into the whole and, of course, export the data in other interesting formats. One friend immediately figured out a solution where users can make libraries of their own, give recommendations, and simply give a thumbs-up to books they like. The road is open.

Kind Regards,
Aki

[1] http://www.helmet.fi/
[2] http://data.kirjastot.fi/
[3] http://www.infoloom.com/pipermail/topicmapmail/2010q3/008227.html
[4] http://www.infoloom.com/pipermail/topicmapmail/2010q3/008231.html
[5] http://www.springerlink.com/content/e2q56340t07g795w/
[6] http://www.infoloom.com/pipermail/topicmapmail/2010q3/008298.html
[7] http://www.infoloom.com/pipermail/topicmapmail/2010q3/008314.html
All times are GMT + 2 Hours