Blair Jennings

High End Computing, Semantic Web, .NET Oh My!






Moving

This blog is moving to DevAuthority.com

 

I want to thank Donny and all of the Dotnetjunkies crew for hosting me for the last year.

 

blair

posted Tuesday, June 28, 2005 9:27 AM by blairj

Why the semantic web lost the war for developers

OK, back to my normal musings on life, the web, and everything ;)

     Thanks Douglas Adams

Well, I have been trying to figure out why RDF and all of the other semantic web acronyms have not become household words, and my answer is that it takes way too long to understand the technology. And once you think that you understand it, a curve ball gets thrown at you and changes everything. So I have found a simpler way to do the same thing: instead of trying to define all of the relationships your data will have on the fly (like RDF and OWL do), set up a system to limit your vocabulary. I settled on WordNet, which is basically an electronic dictionary, thesaurus, and encyclopedia all in a C# wrapped API (the API needs some work, but it does do what it is supposed to). How I plan to use WordNet is simple: let the dictionary define my limited vocabulary (if it is not in the dictionary, tough :); then use WordNet's hypernym, hyponym, and holonym trees to define relationships between related words, going either more abstract (hypernym), more specific (hyponym), or from a part to its whole (holonym). By using these basic relationships I can make sure the data always stays in some form of context, because I tag the data with the right context just as it enters my application. So I can always pull out a piece of data and its relatives with just the context tag, making RDBMS searches that much more specific.
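
To make the tagging idea concrete, here is a minimal sketch in Python. It does not use WordNet or its C# wrapper; the HYPERNYMS table is a made-up, hand-written stand-in for a hypernym tree, and the function names are hypothetical. It only shows the shape of the approach: reject words outside the limited vocabulary, tag each record with its chain of more abstract terms on the way in, and search by any term in that chain.

```python
# Toy stand-in for a hypernym tree: each word maps to its more abstract parent.
# These entries are invented for illustration, not taken from WordNet.
HYPERNYMS = {
    "beagle": "dog",
    "dog": "canine",
    "canine": "mammal",
    "mammal": "animal",
    "oak": "tree",
    "tree": "plant",
}
ROOTS = {"animal", "plant"}


def hypernym_chain(word):
    """Return the word followed by each more abstract term, up to the root."""
    if word not in HYPERNYMS and word not in ROOTS:
        # "If it is not in the dictionary, tough."
        raise KeyError(f"'{word}' is not in the limited vocabulary")
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain


def tag_record(record, word):
    """Attach the context chain to a record as it enters the application."""
    return {**record, "context": hypernym_chain(word)}


def find_by_context(records, term):
    """Return every record whose context chain includes the given term."""
    return [r for r in records if term in r["context"]]
```

With records tagged this way, asking for "mammal" returns the beagle record but not the oak one; in a real system the context chain would presumably live in an indexed column rather than an in-memory list.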

 

Until next time,

Blair

posted Tuesday, February 22, 2005 7:04 PM by blairj

Comments are off

Well, it finally happened to me: I got found by the comment spammers. Hopefully someone will figure out how to stop those @#$%$^&?!~ people soon. So until that happens, comments are off.

In the meantime I am going to keep on talking about the Tablet PC. I had a great time at the Windows Anywhere conference. In fact, I am now writing on my own Fujitsu LifeBook. I won it on the last day of the conference, and truthfully I kind of like it better than the Toshiba M200 I use at work. I am actually using the pen and the TIP (Tablet PC Input Panel) to write this blog entry. That is something I have tried with the M200 but have yet to get better than 70% recognition with; this machine is doing around 90% accuracy, and in fact I am writing faster than I can type.

Until next time,

Blair

posted Friday, February 11, 2005 8:59 PM by blairj

Windows Anywhere Conference Notes

OK, I have been here in sunny San Francisco for a couple of days, and this is the current breakdown:

Paul Mooney is talking about how the TabletPC just amazes him. I have to agree: the machine is amazing, especially after watching Charles Petzold make his Portege M200 do some very impressive graphics work with the new Real Time Stylus APIs (most of it in less than 200 lines of C#). I have to say that if these machines do not start to really take off soon, a major opportunity will be lost.

Paul also asks why Tablets have yet to really grow in market size. After seeing the hardware talk yesterday, I have to agree with the speaker (I cannot remember his name, but he is the hardware engineer behind the tablet) that currently the big problem with the tablet for the consumer market is the lack of an optical drive. Consumers want the machine mostly for entertainment, and this need is about to be fulfilled. In the next couple of months a new batch of tablets is going to hit the market with integrated optical drives, so now the only thing that needs to happen is for the price to come down some, and the market will explode.

Now, as to programming the tablet: I am very interested in how the tablet group has made it possible for the developer to expand the abilities of the machine. You truly have the ability to grab the data straight from the digitizer and manipulate it however you want. I will be posting some example code for playing with the Real Time Stylus in the next week (hopefully :). Charles Petzold definitely made me think of new ways to extend the functionality of this machine.

 

until next time

blair

posted Wednesday, February 09, 2005 5:45 AM by blairj

The Tablet and Mobile PC Developers Conference

It is at VS Live, Feb 6 - 10, and I will be there. Who else is going?

 

Blair

 

P.S. If there is enough interest, maybe a Geek / Nerd Dinner could be scheduled.

posted Sunday, January 30, 2005 7:02 AM by blairj

Do you trust your data source?

Rockford Lhotka says in this TSS article that the data layer should be considered an external actor. Well, I agree with him, for these reasons:

1) If the database is not in your control, who says that the DBA will not change the schema one day, breaking some but not all of the applications written against it? This does happen for perfectly good business reasons, e.g. new laws, a new business rule, or some other reason.

2) With the advent of SOA it is now possible to have failover between different data sources which hold similar data, where the only real difference is how the data is stored and/or formatted. One DB sends its data out in an old text format while another sends it out in XML. How do you merge them, especially if you do not know that the failover occurred? (Think same web service, different DB behind it: the WS still sends out a string, but the format has changed and you did not do anything.)

3) With new applications using p2p technologies it is very possible that the same data can and will be found in more than one place, with the only difference being the XML dialect the data is living in (same data, different XML tags). The issue here is that things occur dynamically: the machine does not know what it will get until it gets it, i.e. “real time”.

All three of these are things we need to be thinking about now: how do you deal with dynamically generated / found / produced data when you probably do not know where it came from or even what format it is in? We need to really start dealing with this issue, because it is only going to get worse with the advent of easy p2p systems, especially over IPv6 (multicasting is just too easy there), and of WS-Discovery, which, while it is currently positioned for small connected devices, can easily be used to find disparate web services without the need for a centralized registry server.

These issues are coming down the road at a very fast pace, and we need to start determining how to deal with them now or else we will get overwhelmed by the sheer size of the problem. While Rocky does not give any answers, he does raise some interesting points. One: almost every data source is outside of the application's trust boundary (I can think of some for which this is not true, but they are a very small percentage of the applications in the wild). Two: any interactive interface, whether it is a data source or a GUI, is outside of the trust boundary, and you cannot trust what happens there. (An interesting correlation: on all of the projects I have been on, ~80% of the bugs in the system were in the layers which interacted with these external actors, not the layers which were within the trust boundary.) This is an area which everyone needs to start thinking about, especially if you are doing SOA, because you will be bitten by these issues, maybe not now but soon.
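
One way to act on the "external actor" idea is to check every row at the trust boundary instead of assuming the schema is still what it was yesterday. The sketch below is in Python and is only an illustration under stated assumptions: the EXPECTED field names and types, the BoundaryError class, and the admit function are all hypothetical, not anything from Rocky's article.

```python
# Hypothetical contract for rows coming from a data source we do not control.
EXPECTED = {"sample_id": int, "compound": str, "mass": float}


class BoundaryError(ValueError):
    """Raised when external data does not satisfy the expected contract."""


def admit(row):
    """Validate and coerce one externally supplied row (a dict).

    Fails loudly at the boundary if the source dropped or renamed a column,
    or if a value cannot be coerced to the expected type. Columns we did not
    ask for are dropped rather than passed inward.
    """
    missing = EXPECTED.keys() - row.keys()
    if missing:
        raise BoundaryError(f"missing fields: {sorted(missing)}")
    clean = {}
    for field, kind in EXPECTED.items():
        try:
            clean[field] = kind(row[field])
        except (TypeError, ValueError) as exc:
            raise BoundaryError(
                f"{field!r} is not a valid {kind.__name__}: {row[field]!r}"
            ) from exc
    return clean
```

The point of the design is that a schema change breaks in one obvious place with a clear message, rather than surfacing as a strange bug several layers inside the trust boundary.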

Until next time.

Blair

posted Thursday, January 20, 2005 6:47 PM by blairj

Musings about my TabletPC

<Test of Ink Recognition>

Her Eugene This is a test of Inking a my Fact pc were it-rad Beatably right

</Test of Ink Recognition>

I actually wrote:

Hello everyone. This is a test of Inking on my TabletPC.

Well, I have to say that I think I need to train this machine (a Toshiba Portege M205) better on my handwriting (now I need to figure out how :)

To be honest, my handwriting sucks (it has been compared to an MD's scrawl on a prescription pad way too often :). So usually when I want text recognition I use block letters, but that is not as fast as handwriting. A lot of the things I use this machine for tend toward graphics: ad hoc class diagrams, system flow diagrams, ad hoc whiteboard stuff. I really like it for that functionality. I also love OneNote, except for the fact that you cannot get the info back out of it: there is no way to export the data out of OneNote programmatically (I have a bunch of programs I would love to write using a large amount of the functionality of OneNote, but until there is an API to move data in both directions they will not happen).

The other use I have for this machine is as a great secondary development box (I have VS 2005 and the Avalon CTP on it). I love VS 2005; it is a large step forward in the productivity arena. The refactoring support is good (there are some more refactorings I would love to see, but the ones provided are good). A possible enhancement would be to allow end users to write their own refactorings (I wonder how hard that would be).

Well, I hope to be able to blog more from now on (I made my major deadline, Jan. 5, last Friday. Yeah!!!). Now I will have more free time to really start writing. I want to write a couple of articles over the holidays about the things I have been researching (Avalon, Indigo, the semantic web stuff, and how to tie all of them together).

Well I need to go I promised my oldest I would take her shopping (UGG) oh well. Until next time.

posted Friday, December 24, 2004 7:05 AM by blairj

Enterprise Integration Patterns part 2

Well, I have implemented the first set of Data Translators in my Normalizer (for those who do not know, a Normalizer is made up of a Message Router and a set of data translators). It turns out that I had already put a Message Router into my application, so all I needed to do was add the translators. So now I have come to the conclusion that, while this is all well and good, what I really need is a generic Canonical Data Format. (I really did not want to try to do this; I mean, how do you write one format which can represent N different types of data arriving in (N-X) different formats?)
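
For readers who have not met the pattern, here is a rough Python sketch of a Normalizer as described above: a router that inspects each incoming message and hands it to the matching translator, with every translator producing the same common shape. This is not the code from my application; the three message formats, the predicates, and all of the names are assumptions chosen to keep the example small.

```python
import json
import xml.etree.ElementTree as ET


def translate_xml(message):
    """Translate a flat XML message into the common dict shape."""
    root = ET.fromstring(message)
    return {child.tag: child.text for child in root}


def translate_json(message):
    """Translate a JSON object message into the common dict shape."""
    return dict(json.loads(message))


def translate_text(message):
    """Translate a hypothetical legacy 'name, value' text message."""
    name, value = message.split(",", 1)
    return {"name": name.strip(), "value": value.strip()}


class Normalizer:
    """A message router plus a set of translators, checked in order."""

    def __init__(self):
        self._routes = []  # (predicate, translator) pairs

    def register(self, predicate, translator):
        self._routes.append((predicate, translator))

    def normalize(self, message):
        for predicate, translator in self._routes:
            if predicate(message):
                return translator(message)
        raise ValueError("no translator registered for this message")


def build_normalizer():
    """Wire up the router with one translator per known format."""
    normalizer = Normalizer()
    normalizer.register(lambda m: m.lstrip().startswith("<"), translate_xml)
    normalizer.register(lambda m: m.lstrip().startswith("{"), translate_json)
    normalizer.register(lambda m: "," in m, translate_text)
    return normalizer
```

Adding a new data source then means registering one more (predicate, translator) pair; nothing downstream of normalize has to change, which is the whole appeal of the pattern.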

So I decided to go and look at whether anyone had tried to make a generic data format (XML based, preferably), and what do I find but 4 different projects to do just that. They are NASA's XDF and CDF projects, NCSA's HDF, and the Global Grid Forum's DSDL. Well, this made my day: all I need to do is figure out how to convert my data into one of these formats and away we go, right? Wrong. First, DSDL is the ultimate in vaporware; the only thing that group has done for the last three years is argue about what DSDL should be. OK, so that leaves three others; let's look at them. HDF has some great ideas (hierarchical data for one), but it is mired in code from the early '90s: all of the major functionality is in C, not C++ or even Java, but C. This makes converting the code to .NET problematic (I have tried to port C code before, and some of the concepts in C just do not have an equivalent in .NET, or Java for that matter). OK, that is fine, maybe they have an XML representation defined by a schema so I can deserialize it into an object set? Nope, just one defined by a DTD, and while I could try to reverse engineer a schema from the DTD, I have found that to be problematic.

OK, so now it is time to look at CDF and its relative XDF. First thing: CDF is old. It was started back in the late '80s to deal with the proliferation of data formats coming out of space science (that sounds familiar; I am dealing with the same problem, only in biology and chemistry, over 10 years later). Well, now let's look: the last update was 2 years ago, and it seems to have been a merger between XDF and CDF which allows XDF to represent CDF data as XML. OK, that's fine, so let us go look at XDF. It is really just an XML schema for generic data, and lo and behold it is a very well written and thought out XML schema. I can use this. It took me 2 hours to get SQL Server 2005 to accept data into a strongly typed XML column and to push and pull XML serialized objects in and out of the DB in XDF. This was way too easy.

posted Saturday, November 13, 2004 2:19 PM by blairj

Enterprise Integration Patterns

OK, I am finally getting around to reading Enterprise Integration Patterns (I bought it back in May at Tech Ed San Diego and it has sat on a shelf until 2 weeks ago (Ugh)). Well, I am looking at the Messaging patterns and I have come to the amazing conclusion that a lot of the problems which the “semantic web” people have been trying to solve were already mostly solved back in the old days of MOM (message-oriented middleware). 75% of the problem is how you deal with data (i.e. Messages) that arrive in multiple different formats, and then use that data in such a way that your application does not become a maintenance nightmare.

So now that I know how to deal with my basic problem, the remaining issue is what common data format I am going to convert to. So far I am thinking that I will not convert to one Meta-Format (probably an impossible thing to do anyway; no format can deal with every different kind of data in the multiple different scientific arenas, which is my specific domain). Instead I will pick out a set of already defined XML standards and just convert semantically equivalent data formats to the matching well-defined standard. While this will not eliminate the fact that I will have multiple different data representations in my application, it will minimize the number of different formats in the system.

Well hopefully this will actually work (thank goodness this is a research project and not a production app:); if it does then I will write a full article on this subject.

posted Thursday, November 04, 2004 7:41 AM by blairj

Xaml, late bound plug-ins, poor documentation and old documentation

OK, well, I know it has been a while since I last blogged, but hey, sometimes life gets in the way (I took a great 2 week vacation to Ireland, instituted a new persistence system into the major project I run, and played with my kids in what little free time I've had).

<rant on>

OK, on to the topic. I have been using Xaml since last May (mostly using MyXaml, but keeping an eye on Mobiform, Xamlon and Microsoft). To give you an idea of how I am trying to use Xaml: the program I wrote is trying to make a more powerful plug-in system than what Windows.Forms can do now. This application currently uses a late bound plug-in system to allow it to be extended in different ways (read: new data source begets new plug-in(s) to deal with the data). I had a system using MyXaml which allowed me to load a Xaml file + codebehind into a System.Windows.Forms.UserControl and deal with the plug-ins as separate units of work. I have tried to do the same with the existing code bases out on the web, and, well, I cannot seem to separate the Xaml parser from the main form of the application. This is not acceptable. I need a system where I can load the Xaml file + codebehind on the fly (sometimes I may be loading the files from a Web service or even a local database (Oh my); this is an idea I am playing with for distributing plug-ins, more on this in a different post). For the last two weeks I have been trying out the latest versions of the parsers on the web, and unfortunately I have to say that MyXaml still has them beat at making a working GUI. Not that what they have achieved is not significant, it is; but from my perspective they just do not quite cut it (I need to create standard GUIs, and yes, they need to do things like 3D visualization and show pictures; it is a scientific application and needs to show chemical structures, topo maps, etc.). Yes, the graphics are there; now let's get the integration with a standard application done.

 

OK, now on to documentation. The first issue is with what is still out on the web: most of the documentation for Xaml is from last year's PDC, for ch#$$ sake, and most of it has no basis in reality today. (I have been trying to find some info on the Mapping PI, but the only person who seems to have used it is Don Box.) Arrgh!!! All I want is to find out how many different attributes a Mapping PI can have besides xmlnamespace, namespace, and assembly. But none of the documentation I find even states that the Mapping PI exists! Now, as to poor documentation: I would really love it if someone would write an article about Xaml which did not use inline code! How about a nice codebehind file? Separate the controller from the view, people (my model lives a long way away from either the view or the controller; for those who do not know what I am talking about, go read up on the MVC pattern).

</rant off>

 

OK, well, I needed to get this off my chest (I have been very frustrated for the last 2 weeks, and just when I thought the problem was licked I found out that, no, the system was actually early bound). UGG.
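
For anyone unsure what late bound means here, the sketch below shows the general idea in Python rather than C#, and it has nothing to do with Xaml parsing itself. The host knows only a string naming where a plug-in lives and resolves it at run time, so new plug-ins can be added without recompiling the host. The PLUGINS table and the "module:attribute" naming convention are assumptions made up for the example; the standard-library json and csv modules merely stand in for real plug-ins.

```python
import importlib


def load_plugin(spec):
    """Resolve a 'package.module:attribute' string to an object at call time."""
    module_name, _, attribute = spec.partition(":")
    if not module_name or not attribute:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)  # imported only when asked for
    return getattr(module, attribute)


# Hypothetical registry: data-source kind -> where its plug-in lives.
# In a real system this could come from a config file, a database, or a
# web service, which is what makes the binding late.
PLUGINS = {
    "json": "json:loads",
    "csv": "csv:reader",
}


def handler_for(kind):
    """Look up and load the plug-in registered for a data-source kind."""
    try:
        spec = PLUGINS[kind]
    except KeyError:
        raise LookupError(f"no plug-in registered for {kind!r}") from None
    return load_plugin(spec)
```

An early bound system, by contrast, would import json and csv at the top of the host and call them directly, which is exactly the coupling I was trying to get away from.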

posted Wednesday, October 06, 2004 5:30 PM by blairj

The problems with converting unit tests between platforms

I am working on a project which basically is converting a Java library to C#. While the JLCA (Java Language Conversion Assistant) did an OK job doing the monkey conversion (all of the boring syntax stuff) of the main code base, it really messed up the JUnit tests. I have been using TDD for a rather long time, going on 5 years now (started with JUnit, now use NUnit), but I never really liked JUnit's ideas about TestCases and TestSuites. They always seemed to be more work than they were worth. I like to just write the tests and let the test runner figure out how to report things (with NUnit this is a very natural way of doing things).

 

So my problem is that all of the converted Java unit test code uses the TestSuite idea in a way that makes it hard to understand what is being tested, and the purpose behind the tests is highly obscured. I have actually come to the conclusion that rewriting the tests will take less time and make what they are trying to do more coherent.

 

The other thing I really hated about the Java code was the fact that all of the test code was intermixed with the main code base. Arrgh, this drives me up a wall: not only does it cause unnecessary code bloat, it also makes build scripts overly complex and hard to maintain. It also causes strange bugs to appear, like the fact that the different code bases (main, unit tests, and console app) had a rather nasty circular dependency (unit tests -> main code -> console app -> unit tests). Breaking this gave me a headache. I am going to go take two aspirin and I will see you in the morning :)
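
A cycle like that one is easy to find mechanically once the dependencies are written down as a graph. The Python sketch below is a plain depth-first search with three node states; the find_cycle name and the dictionary layout are my own choices for illustration, and a real build would get the edges from its project files rather than a hand-typed dict.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    graph maps each component to the list of components it depends on.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}
    path = []  # the chain of components currently being explored

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep, UNSEEN) == IN_PROGRESS:
                # dep is already on the current path, so we have looped back
                return path[path.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNSEEN:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


dependencies = {
    "unit tests": ["main code"],
    "main code": ["console app"],
    "console app": ["unit tests"],
}
print(find_cycle(dependencies))
```

Run on the three code bases from this project, the function reports the loop starting and ending at the unit tests; once the console app stops depending on them, it returns None.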

posted Wednesday, August 11, 2004 11:44 AM by blairj

The issues with dynamic discovery of Web Services

Well, I am working on a system which will discover new web services via a p2p multicast system, i.e. the next evolution in SOA (see the WS-Discovery proposal). My biggest problem is not with finding the Web Services (WS); it has to do with all of the different XML dialects coming out of the services I am developing against. These WSs all do similar things to scientific data, but each one has decided to develop its own XML format in order to deal with some small corner case which, in its estimation, the accepted standard does not address right. Well, after pulling my hair out for a couple of weeks, I have come up with an elegant solution: map the XML schemas to a Taxonomy (i.e. a restricted vocabulary). Then I can deal with the differences in the XML by having a concordance table in my database which maps similar tags onto one another, and all I need to know is which tag is the right one for this XML.
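
As a rough illustration of the concordance idea, here is a Python sketch in which the table is an in-memory dict instead of a database table. Every tag name and taxonomy term in it is invented for the example, and the canonicalize function is hypothetical. It maps each leaf element of an incoming document onto its taxonomy term and reports any tags the table does not know yet.

```python
import xml.etree.ElementTree as ET

# Hypothetical concordance table: dialect tag -> canonical taxonomy term.
CONCORDANCE = {
    "mol_weight": "molecular_mass",
    "MolecularWeight": "molecular_mass",
    "mw": "molecular_mass",
    "cmpd": "compound_name",
    "CompoundName": "compound_name",
}


def canonicalize(xml_text):
    """Map each leaf element onto its canonical term.

    Returns (known, unknown): a dict keyed by taxonomy term, and a list of
    tags that have no entry in the concordance table yet.
    """
    root = ET.fromstring(xml_text)
    known, unknown = {}, []
    for element in root.iter():
        if len(element) > 0:  # container element, not a data-carrying leaf
            continue
        term = CONCORDANCE.get(element.tag)
        if term is None:
            unknown.append(element.tag)
        else:
            known[term] = (element.text or "").strip()
    return known, unknown
```

Two services that disagree on tag names now produce the same dictionary, and the unknown list tells me which rows still need to be added to the table when a new dialect shows up.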

 

 

Listening to: KFOG online

posted Monday, August 09, 2004 12:31 PM by blairj

The issues with upgrading

I am working on a small project using the new Visual C# Express and SQL Server Express. Well, I just upgraded SQL Server Express to the new beta, which meant I needed to upgrade the .NET SDK also, so I did. That went off perfectly fine and everything worked, except that when I went to rebuild my app I got a boatload of compile errors. It seems that my project (something I have been working on, off and on, for a while) had a bunch of references to earlier frameworks. OK, no problem, let's go fix the issues. Then I noticed that some of the references in the same project were to the right version of the framework and others were to an earlier version. Oh boy, I do not know how that happened; somewhere the references got lost / mixed up / something. Oh well, live and learn.

posted Monday, July 26, 2004 1:54 PM by blairj

Hello Everybody!!!!

Well, my first blog. OK, here goes: something about me. I work at one of the nation's premier supercomputer centers as lead programmer on a .NET project. This project spans many areas, like high end grid computing, the semantic web, and how to make an easily extensible program in C#. More on each of those topics in later posts.

 

For now let's also go into my side projects. Out on GotDotNet I am the lead for the JenaNET project; this project is a port of HP's Jena 2 semantic web API to .NET. We started a little over a week ago and it is proceeding nicely (I just wish I had more free time). I will go more in depth about how the semantic web can be used in everyday programming over the next few posts.

 

Ok well I need to get back to work.

posted Monday, July 12, 2004 6:51 PM by blairj



