Next up for Blackboard: automatic JSON-based workspace persistence. Rumor has it we are working on a new persistence layer built specially to support Blackboard's persistence needs. Stay tuned!
Wednesday, October 14, 2009
Blackboard now in Beta!
Blackboard is now in beta! A new version of Blackboard (beta-0.20) is now available for download. This version comes with programmatic workspace configuration support, JSON workspace serialization, and automatic jar file loading for standalone operation.
Labels:
artificial intelligence,
blackboard,
business rules,
Java,
JavaScript,
JRuby,
json,
Ruby
Sunday, September 20, 2009
Cloud Computing and Hannibal Let You Be You
Recently, the federal government announced it will offer Cloud Computing to federal agencies through Apps.gov. As Werner Vogels, the CTO of Amazon, recently pointed out in his blog, the move comes as a result of the realization that there is too much effort spent on managing IT infrastructure. Amazon realized this years ago and created Amazon Web Services to help alleviate this cross-organizational concern.
Cloud Computing solves this problem by turning infrastructure into a utility. Abstracting the key problems and offering a simple way to manage IT infrastructure allows an organization to focus on the real task at hand.
We have observed the same phenomenon of duplicated effort in creating software. We have taken the philosophy of automating wherever possible to solve many common problems of creating an application. Hannibal takes care of creating the core of a web application, from persistence to presentation.
Hannibal enables you to focus on creating interesting applications instead of worrying about the common problems of creating an application, in the same way that Cloud Computing lets you focus on more important things than IT infrastructure.
Labels:
amazon web services,
Cloud Computing,
Hannibal,
S3
Friday, July 31, 2009
The Importance of Being Informative
If you are a public transit commuter in Washington DC, no doubt you are spending a fair portion of your time wondering about the state of our subway transit. I have been commuting via subway for a fair number of years now, and during that time I have often made observations about the dynamics of our subway, and how they relate to the design and architecture of applications that we build for the web.
It was not so long ago that WMATA installed status signs above the tracks in every subway station. As an example, I have included a photo of one of those signs.
This represents a big improvement over what had existed before, which was limited to a set of flashing lights at track level meant to warn you that a train was just arriving, but really nothing more than that. Without these signs, passengers would stand in the subway station without any clue as to when the next train would arrive.
As you can see, the color of the line (RED), the size of the train in cars (6), the direction (GLENMONT) and the arrival time (ARR) are presented on these status signs. This is a good thing because it helps newly arriving customers that enter the train station understand how long they have to wait before their train arrives.
But I submit that this is not good enough. As a matter of fact, with just a little more thought, Metro could have done much better. Here are all the things I think are actually wrong with the current implementation:
- These electronic dynamic signs are being used to present mostly static information. Over half of the sign is used to present the name of the line (RD) and the direction of the train (GLENMONT), but for a given platform on our subway this never really changes! The train station is littered with signs that tell you the direction of the train on the track, and the train itself tells you the final station, so having this information on an electronic sign is pretty much redundant. Displaying static information is not the best way to use a dynamic presentation sign.
- Although the "next train arriving information" is certainly relevant, it is not the most pertinent to a metro traveller. The status signs tell you how long you have to wait until the next car arrives, but is this really what you want to know? Or is there something more pertinent than this?
- The signs are in the wrong location. These signs provide important information about the state of the system, yet they are located far inside the metro. In most stations, most of these signs are not even visible from the areas where customers may purchase tickets.
So what would be a good way to represent this? What I am suggesting looks very much like the flight status boards you see in modern airport terminals, except instead of displaying departure times for trains leaving the station it should show the expected arrival times for all the stations in the system accessible from this track (direction) at this station.
And finally, the last change I would suggest is to make these signs viewable from the street entrances. Why? As a traveler I want to know the state of Metro before I pay my fare. I should be able to decide whether or not I want to ride the train, take a taxi, or drive to my destination. This is very important because open systems with finite capacity (like subways and highways) cannot control the arrival patterns of new users. As a matter of course, systems like these should always provide real time performance feedback to their end users. In the case of Washington DC's subway, this would allow the system to recover from performance issues more gracefully. When the system is overloaded, some customers would not enter the train station if they were made aware of issues at the gate. Taxi cab drivers would also be able to use this information to pick up passengers who, due to the current overloaded state of the subway system, would not want to rely on Metro to get them to their destinations on time.
I have addressed the three aforementioned issues, but there is still one issue that I have not handled. It is a matter of money. Metro has already invested quite a bit of money on the signs they have, and money in this climate is hard to find. How could WMATA implement the improvements I am suggesting at minimal cost to the taxpayers of DC?
Well, one way to do this would be to reuse the existing signs. There is a problem with this approach. The Washington DC metro has over 80 metro stations. To show all 80 of those stations using the existing status boards would be a challenge to say the least. How could we present this amount of information given such a limitation in real estate?
What if WMATA used colored symbols (shapes) combined with a prioritized information set? One example of such a presentation could be to model this using a combination of popular station hops and system status colors. The picture below gives you an example of what one such sign would look like:
So instead of showing the end of the line information and the arrival time of the next train, we can show the expected arrival times for key stations in the system. What stations would qualify as key stations? This would depend on the station you are waiting in, but it would most often be transfer stations, and popular exit points. So if you get on the Red line train at Takoma station and you are heading downtown during the morning rush hour, the message board should give you the expected arrival times for Takoma, Union Station, Gallery Place, and Metro Center. For the vast number of travelers entering Takoma I would expect that this would be more than enough to determine whether or not the system is working well. The color coded symbols will tell you whether or not the expected arrival time is considered normal or abnormal.
In the sign above, the 7 minute delay between Gallery Place and Metro Center is considered below normal performance, and as such, a red heptagon (for color blind people a change in shape is helpful) indicates that the system is performing below expectations between Gallery Place and Metro Center.
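As a rough sketch of the color-and-shape rule described above, the snippet below compares the expected time for each key-station hop with its normal time and picks a symbol. The normal times, the two-minute tolerance, and the green circle for "normal" are assumptions made up for illustration; only the red heptagon and the 7 minute Gallery Place to Metro Center example come from the sign discussed here.

```javascript
// Classify one hop by comparing its expected time with its normal time.
// toleranceMinutes is a hypothetical threshold, not a WMATA figure.
function classifyHop(expectedMinutes, normalMinutes, toleranceMinutes = 2) {
  const late = expectedMinutes - normalMinutes > toleranceMinutes;
  // Color and shape change together so the sign stays readable for
  // color-blind riders. The green circle is an assumed "normal" symbol.
  return late
    ? { color: "red", shape: "heptagon" }
    : { color: "green", shape: "circle" };
}

// Build one line of sign text per key station.
function renderSign(hops) {
  return hops.map((hop) => {
    const status = classifyHop(hop.expectedMinutes, hop.normalMinutes);
    return `${hop.station} ${hop.expectedMinutes} min [${status.color} ${status.shape}]`;
  });
}

// Example: the normalMinutes value is invented for this sketch.
const lines = renderSign([
  { station: "Metro Center", expectedMinutes: 7, normalMinutes: 2 },
]);
console.log(lines.join("\n"));
```

A real sign would also need a live feed of expected times; the point here is only that the status symbol can be derived from two numbers per hop.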
In many ways, the information sharing challenges experienced by Washington DC's metro system are similar to many of the issues we architects experience in creating high performance web sites. Displaying accurate system status to end users is a design factor too often left out. This blog posting addresses this mistake in a highly used and very public transit system, but the techniques used to solve those problems can be directly transferred to similar situations on the Web.
Labels:
amazon web services,
architecture,
arrival times,
data,
display,
feedback,
presentation,
subway,
system dynamics
Monday, July 27, 2009
Why Do We Continue to Join?
Just recently my wife and I built a new walk-in closet. When we were finished, in preparing to use the new space, I thought about how my clothes used to be packed and realized there were a lot of inefficiencies. As an example, I used to keep my boxers in a separate drawer from my undershirts. I had been doing this since my childhood. It meant that every day that I got ready for work I would open one drawer to get a pair of boxers, then open another drawer to fetch my undershirt. Thinking about it a bit, it seemed to be such a waste of time to keep them in separate drawers. Why didn't I just put my boxers and undershirts in the same drawer?
So that, along with other improvements, is exactly what I did. Now getting dressed in the morning is a little bit easier for me.
This is what I want to talk about today: revisiting old data schema arrangements and asking ourselves whether or not they are the best schemas for our products.
Most of the database schemas backing traditional CRM applications today are designed in such a way that data entities belonging to the same customer record are stored in different tables. To bring those entities together to form the entire customer record, these same CRM applications rely heavily on relational database joins. Individual data entities are stored in separate tables, but are retrieved via joins to make up the customer record. This would make a lot of sense if these records were normally retrieved in parts, but more and more this is not the case.
Instead, the business expects us to retrieve customer records in whole. Now there is talk of the 360 degree view of the customer. Although there are varying opinions of what this means, it is generally understood to mean that all information about the customer should be retrievable via the customer's unique identification number. If you think this sounds like a hashtable key-value relationship, I will have to agree with you.
So why do we continue to store data in this manner? I believe that one word gives us the answer, and that word is "tradition". We have an entire generation of programmers, relational database administrators, and technical managers that believe that "good" database design always incorporates normalized entities that are connected via foreign keys to other entities. They think this without giving much thought to the way the data they are storing will actually be retrieved.
Instead, should we not revisit the notion that customer records need to be separated into different data entities? I would submit that once this examination is completed, it will become apparent that relational schemas make little or no sense for the operational stores of modern CRM applications.
Does this mean that there is no place for relational databases? Of course not! Relational databases are great for creating reports and performing ad-hoc data analytics. As key value stores are designed today, using one for reporting would be painful at best. But for the types of canned data requests that modern day call centers produce, a key value store is tough to beat.
Instead, what this does mean is that more consideration should be given to using key value stores in the enterprise. There are several that are becoming popular. MemCache makes a lot of sense when caching joined information by key, as does EHCache. Using this approach, the joined information can be stored in a persistent relational database, but once the expensive join is completed, the now flattened information from that join can be stored in memory and retrieved via any number of keys. Amazon's S3 remains a viable option, as do open source distributed key stores like Voldemort, Cassandra, and Couch DB. Finally, Oracle's Berkeley DB bears some looking into as well.
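To make the "join once, then fetch by key" idea concrete, here is a minimal sketch. It assumes three tiny in-memory arrays standing in for normalized tables and a plain Map standing in for the key value store; the table contents, the customer/<id> key format, and the function names are all hypothetical.

```javascript
// Hypothetical rows standing in for three normalized relational tables.
const customers = [{ id: 42, name: "Ada" }];
const addresses = [{ customerId: 42, city: "Washington" }];
const orders = [
  { customerId: 42, item: "shoes" },
  { customerId: 42, item: "socks" },
];

// Stand-in for a key value store such as an in-memory cache.
const cache = new Map();
let joinCount = 0; // counts how often the expensive "join" actually runs

// Simulate the relational join that assembles the whole customer record.
function joinCustomer(id) {
  joinCount += 1;
  const customer = customers.find((c) => c.id === id);
  if (!customer) return null;
  return {
    ...customer,
    addresses: addresses.filter((a) => a.customerId === id),
    orders: orders.filter((o) => o.customerId === id),
  };
}

// Return the flattened record by key, joining only on a cache miss.
function getCustomer(id) {
  const key = `customer/${id}`;
  if (!cache.has(key)) {
    const record = joinCustomer(id);
    if (record === null) return null;
    cache.set(key, record);
  }
  return cache.get(key);
}
```

Calling getCustomer(42) twice runs the join only once; the second call is a single lookup by key. A production version would also have to invalidate or refresh the cached record when the underlying tables change.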
Wednesday, July 22, 2009
Latest Hannibal Release Sports Persistence via Amazon's S3 Key Store!
Now you can use S3 as a persistent store when using Hannibal. The wiki has all the details.
Hannibal Version 0.40 Release 593 just Uploaded.
This release contains numerous bug fixes as well as support for Amazon's S3 persistence. In addition, developers may now use Hannibal's S3Realm.
Here are the release notes.
Wednesday, June 10, 2009
Java's Lists, Sets and Maps with Hannibal! Oh my! What have we done!?
Manipulating Java Collections using Hannibal can be described with one word. Comfortable! Check out this wiki entry on using Java collections in JavaScript with Hannibal.
Hannibal makes Testing ReST easy
The Browser class in Hannibal makes testing ReSTful Web Services easy. Simply call get, put, post, delete, or options and pass in a URL and, if needed, a parameter map. The code would look something like this:
var response = browser.delete(myUrlString)
browser.post(myUrlString, [parameterName: "someParameterValue"])
Then get the response from the web service call and make assertions. Take a look at the unit tests from our SvnServices project for more examples.
Look for more tips on using Hannibal in the future.
Labels:
amazon web services,
groovy,
Hannibal,
restful,
SvnServices,
testing
Wednesday, June 3, 2009
Hannibal is in the Cloud!
We just pushed code generator support for Amazon's S3 key value store into Google Code. Now developers can generate data access objects that can persist data to S3. Active records are serialized to JSON then published to an S3 bucket named after your application's name. Retrieving data simply involves changing the JSON string back to an active record using the fromJson() method.
Here is what a URI path for a race domain object in the marathonracing application would look like:
marathonracing/race/cherryblossom
where marathonracing is your S3 bucket name and race/cherryblossom is your S3 object.
Very simple.
We are testing this functionality now. Expect to see this in the next major release of Hannibal.
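For readers who want to see the shape of the idea, here is a small sketch of the bucket/type/id layout and the JSON round trip. It uses an in-memory Map as a stand-in for S3 and plain JSON.stringify and JSON.parse; the function names and the distanceMiles field are hypothetical and are not Hannibal's generated code or its fromJson() method.

```javascript
// In-memory stand-in for S3: bucket name -> (object key -> JSON string).
const buckets = new Map();

// Build the object key "<type>/<id>", e.g. "race/cherryblossom".
function objectKey(type, id) {
  return `${type}/${id}`;
}

// Serialize a record to JSON and store it under bucket/type/id.
// Returns the full path so the caller can see where the object landed.
function save(bucket, type, record) {
  if (!buckets.has(bucket)) buckets.set(bucket, new Map());
  const key = objectKey(type, record.id);
  buckets.get(bucket).set(key, JSON.stringify(record));
  return `${bucket}/${key}`;
}

// Fetch the JSON string and turn it back into a plain object.
function load(bucket, type, id) {
  const objects = buckets.get(bucket);
  const json = objects ? objects.get(objectKey(type, id)) : undefined;
  return json === undefined ? null : JSON.parse(json);
}

// Usage: distanceMiles is an invented field for illustration.
const path = save("marathonracing", "race", { id: "cherryblossom", distanceMiles: 10 });
const race = load("marathonracing", "race", "cherryblossom");
```

Swapping the Map for real S3 calls would leave the key layout and the serialize and parse steps unchanged.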
Labels:
amazon web services,
cloud,
Hannibal,
json,
S3
Thursday, May 21, 2009
Svn Services
Lucid has released Svn Services, an open source project that provides restful Web Services using SVNKit and Hannibal.
You can use Svn Services to manage your Subversion repositories via web services. Using Svn Services you can create or remove Subversion repositories. You can manage your Subversion repository users as well. Svn Services is deployable as a war file in Tomcat.
You can download the alpha war file here.
Labels:
Hannibal,
Java,
JavaScript,
rest,
restful,
web services
Web Services are Simple With Hannibal
We got a few questions about the "Races and Runners" example so we updated the Hannibal example to explain how web services can be created with Hannibal.
The end result? Using Hannibal, you can create a web service that collects, validates, and persists a Race domain in just 3 lines of code!
Go have a look at this page on the wiki.
Wednesday, May 20, 2009
Hannibal is Simple Too
So we were just reading an interesting post over at The Disco Blog regarding the simplicity of creating web services using Grails. Grails is a very popular framework for creating web applications. After reading that article we were convinced that creating a web service using Grails was indeed simple; of course we wondered how Hannibal stacked up to that ...
So we decided to implement that "runners and races" example in Hannibal. Please visit the wiki for a treatment!
Labels:
grails,
Hannibal,
Java,
JavaScript,
rest,
restful,
web services
Friday, May 15, 2009
Lucid releases Blackboard for event processing, system integration, and workflow.
Lucid just recently created a new open source project called Blackboard. Blackboard is a smart event processing framework modeled on the long-established artificial intelligence blackboard design pattern.
Users can submit events to the blackboard where pre-defined plans execute against those events. Although the framework is written in Java, developers write plans in JavaScript and JRuby.
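The general pattern of submitting events and running pre-defined plans against them can be sketched in a few lines. This is a generic illustration of the blackboard idea and not Blackboard's actual API; the class name, method names, and the order example are all assumptions.

```javascript
// A minimal blackboard: plans register for an event name and run when a
// matching event is submitted. This is NOT Blackboard's real API.
class MiniBlackboard {
  constructor() {
    this.plans = new Map(); // event name -> array of plan functions
    this.workspace = [];    // results that plans have written back
  }

  // Register a pre-defined plan for one kind of event.
  addPlan(eventName, plan) {
    if (!this.plans.has(eventName)) this.plans.set(eventName, []);
    this.plans.get(eventName).push(plan);
  }

  // Submit an event; every plan registered for its name executes against it.
  // Returns the number of plans that ran.
  submit(event) {
    const plans = this.plans.get(event.name) || [];
    for (const plan of plans) {
      this.workspace.push(plan(event));
    }
    return plans.length;
  }
}

// Usage: a hypothetical business rule that totals a customer order.
const board = new MiniBlackboard();
board.addPlan("order", (event) => ({
  orderId: event.orderId,
  total: event.quantity * event.unitPrice,
}));
const plansRun = board.submit({ name: "order", orderId: 1, quantity: 3, unitPrice: 5 });
```

Events with no registered plan are simply ignored in this sketch; a fuller implementation would decide whether to queue or reject them.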
Using the blackboard is simple. It is packaged as an executable jar file that can be run from the command line. Blackboard can also be embedded in web applications.
We expect to use Blackboard to quickly implement high level business rules. This means that Blackboard could form the core of applications that process customer orders, perform complex financial calculations, create interesting human workflow applications, or even perform backend data integrations.
We will be posting documentation on how to use Lucid's Blackboard to the wiki over the weekend.
We are very excited about the addition of this tool to our already large programming toolkit. Of course since it is open source it is now available to you as well.
Labels:
ai,
artificial intelligence,
blackboard,
business rules,
event processing,
integration,
JavaScript,
JRuby,
Ruby,
scripting,
workflow
Thursday, May 14, 2009
Hannibal S3 support
Deploying Tomcat in Amazon's cloud just got a little easier! Tomcat provides a JDBCRealm that allows developers to access user password and role information in their database. Lucid extended that concept by creating an S3Realm that allows developers to retrieve user credential and role information from S3.
Configuration of an S3Realm is easy. Just create an entry in Tomcat's server.xml that identifies your application's credentials bucket name, along with your AWS secret and access keys. That's it.
Of course the appropriate S3 bucket should contain the user principal and role information as well. Developers are still responsible for populating this as users register.
This functionality is undergoing testing right now and will make it into the next release of Hannibal! Once it does we will provide the appropriate documentation in the Wiki.
Wednesday, May 13, 2009
Open Data
When Vivek Kundra was the CIO of the District of Columbia, he championed the idea of a more open government. His thought was that by making the District's data more open, both DC's government and its citizens would inevitably benefit from this in unpredictable ways.
One of the many data opening initiatives he encouraged was Apps For Democracy. This was a project that encouraged private citizens to explore opportunities to mash up government data in ways not thought of by the District itself. The purpose of this project was to provoke a few thoughtful responses from the crowd of application developers with free time to engage in such pursuits.
In just over a month, Apps For Democracy received 47 useful data mash up applications. Why was this so successful?
We think it was successful simply because the District data was made accessible to people who cared about it. And the people who cared about the data did what they wanted to do with it. We call this approach, "The Client Knows Best". Give the client access to the raw data, get out of the way, and watch them produce what they want!
What would happen if this were done in the enterprise? Our own observation is that large corporations do the opposite. Data is not even made fully open to the members of the organization itself. For various reasons, enterprise information is usually trapped within the systems of the business pillars that make up the organization. When these data sets were first established, little or no thought was given towards sharing this information with other departmental groups. Later, when it becomes apparent that data sharing would be a good thing to have, attempts to make the data more accessible are hamstrung by the initial lack of forethought.
Complicating this is a tendency to "over own" the data. This desire to control the presentation is understandable, but many times the old fashioned mechanisms for achieving data ownership are so restrictive, the consumers of the data are unable to get what they want in a timely fashion. If data is not open, this will always be true!
What do we mean by open data? Well, open data does not mean data that is available to anyone. It is to be expected that some data should not be shared with everyone due to privacy and other concerns; it is also expected that, for an individual who is deemed privy to a data set, the information should gracefully be made available to that individual. How do you accomplish this? Here are what we consider the minimum requirements a data store must meet in order to be considered open.
- Published data taxonomy - The data's categorical structure shall be made readily apparent to prospective users.
- Robust search capabilities - A user of this data shall be able to perform free text queries against the data store.
- Fundamental CRUD operations - Reading data is fundamental, and so is creating, updating and deleting data. Provided the security credentials of the data user check out, these functions shall be readily available.
- Change notification - Users of the data store shall have an avenue to understand how the data they have accessed has changed over time.
- Centralized but portable data production rules - Other enterprise systems shall have the ability to create and manipulate new data resources outside of the system of record. Strange as this may sound, enabling other systems to create resources outside of the system of record can result in large magnitude improvements in reliability. This shall be encouraged as long as consumers always consider the data source's system of record to be the single source of truth. It is expected that data resources created in this manner shall eventually be synchronized to their system of record.
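Three of the requirements above (free text search, CRUD, and change notification) are small enough to sketch together. The class below is a toy in-memory illustration under our own naming; it is not code generated by Hannibal, and it leaves out the taxonomy, security checks, and synchronization with a system of record.

```javascript
// Toy in-memory store showing CRUD, free text search, and change notification.
// All names here are hypothetical.
class OpenStore {
  constructor() {
    this.records = new Map(); // id -> record
    this.listeners = [];      // change-notification callbacks
  }

  // Change notification: subscribers hear about every create/update/delete.
  onChange(listener) {
    this.listeners.push(listener);
  }

  notify(kind, id) {
    for (const listener of this.listeners) listener({ kind, id });
  }

  // Fundamental CRUD operations.
  create(id, record) {
    if (this.records.has(id)) throw new Error(`duplicate id: ${id}`);
    this.records.set(id, record);
    this.notify("create", id);
  }

  read(id) {
    return this.records.has(id) ? this.records.get(id) : null;
  }

  update(id, record) {
    if (!this.records.has(id)) throw new Error(`unknown id: ${id}`);
    this.records.set(id, record);
    this.notify("update", id);
  }

  remove(id) {
    const existed = this.records.delete(id);
    if (existed) this.notify("delete", id);
    return existed;
  }

  // Free text search: case-insensitive match against each record's JSON
  // form, so field names match as well as values.
  search(text) {
    const needle = text.toLowerCase();
    return [...this.records.entries()]
      .filter(([, record]) => JSON.stringify(record).toLowerCase().includes(needle))
      .map(([id]) => id);
  }
}
```

A consumer would subscribe with onChange, then call create, update, remove, and search as needed; each write triggers one notification.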
Many data stores in large organizations consistently fall short of these goals. As a point of fact, we have never witnessed a departmental data store that had all these criteria met. This functionality is never built in from the inception. This is a shame, since the truth of the matter is that such functionality is actually quite easy to provide up front.
Our code generation suite Hannibal makes it really easy to have this from the inception. Hannibal allows developers to generate code that performs search, CRUD operations, and change notification out of the box. Furthermore, developers can stipulate declarative security rules that govern who is allowed to view this information. Finally, it makes all of this available via restful web services, so that applications from mainframe processes to desktop browsers can effortlessly access this data as well.
Now your data is open.
In the following posts, we will give real world examples of data sharing applications built using Hannibal, and illustrate how typical problems experienced in the enterprise can be easily solved by some creative combination of the five aforementioned bullet items.
Labels:
crud,
data,
data steward,
Hannibal,
integration,
rest,
search
Thursday, February 26, 2009
Hullo Everyone!
Any given problem can have many good solutions, but the very best solution usually depends on the context of the problem. Context can change over time, so solutions that work great today may not work well in the future. It is also true that solutions that didn't seem to make sense in the past, after fresh empirical re-evaluation, can become the best solutions today. A lot of our musings will involve challenging past assumptions with an eye on discovering better and more efficient ways to build great software.
We encourage feedback. Even though we will write about what we think we know well, we expect to learn much from the comments of our readers. Feel free to send us your own stories or ask us questions. We will do our best to work them in to our postings.
Speaking of things we know well, Lucid Technics was once primarily a Java shop. Recent advancements in the JVM have exposed us to the pleasure of programming with dynamically typed languages like Ruby, Python, and server side JavaScript. So now we like to think of ourselves as a JVM shop!
Lucid Technics is also the creator of the Hannibal code generation platform. We are using Hannibal to create applications for our clients. We expect to talk a lot about the architectural philosophies that drove the design of Hannibal, and whether or not those philosophies work in practice.
So please join us. This is going to be good!
Regards,
Bediako and Dave
Labels:
architect,
Bediako,
Dave,
Lucid,
Lucid Technics