
Installing and using Apache Cassandra With Java Part 3 (Data model 2)

Want to read the earlier postings first?

Last time we talked about the various containers in the Cassandra data model (column, super column, column family, and so on). One part we didn’t cover is the sorting behavior. Unlike a normal relational database, Cassandra has no query language in which you could specify an ordering, so you cannot choose how results are sorted when you retrieve data.

Instead, Cassandra sorts the data as soon as you store it in the database, and it remains sorted. This makes reading sorted data fast, but it means you need to decide on the ordering before you start storing data.

Sorting is specified with the CompareWith attribute of the ColumnFamily. These are the options you can choose from (it is also possible to create custom sorting behavior, but we will cover that later):

  • BytesType
  • UTF8Type
  • LexicalUUIDType
  • TimeUUIDType
  • AsciiType
  • LongType

Each of these types treats the Column name as a different data type. LongType, for example, treats the Column name as a 64-bit long value.
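To make the LongType case concrete: a name compared as a 64-bit long is the 8-byte representation of the number, not its digits as text. The following is a minimal sketch, assuming big-endian encoding as produced by java.nio.ByteBuffer. The class and method names are made up for illustration and are not part of the Cassandra API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ColumnNameEncoding {

    // Encode a long as 8 bytes (big-endian, which is ByteBuffer's default order).
    static byte[] longName(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    // Decode the 8 bytes back into the original long.
    static long longValue(byte[] name) {
        return ByteBuffer.wrap(name).getLong();
    }

    // The same number written as text, as a string-typed name would carry it.
    static byte[] utf8Name(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(longName(15L).length);   // 8 bytes for the long form
        System.out.println(utf8Name("15").length);  // 2 bytes for the text form
    }
}
```

The two byte arrays are different, which is why the same logical name sorts differently depending on the type chosen.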

So let’s look at some examples. Suppose we have a ColumnFamily whose CompareWith is set to LongType. In insertion order, before any sorting, the data looks like this:

Columns
Name    Value
“9”     “Ronald”
“3”     “John”
“15”    “Eric”

Since we are using LongType to sort on the name attribute of the Columns, the data will be stored in the following way:

<ColumnFamily CompareWith="LongType" Name="Authors"/>
Columns
Name    Value
“3”     “John”
“9”     “Ronald”
“15”    “Eric”

As you can see, the names are now in natural numeric order. If we change CompareWith to UTF8Type, the names are compared as UTF-8 strings, which results in the following ordering:

<ColumnFamily CompareWith="UTF8Type" Name="Authors"/>
Columns
Name    Value
“15”    “Eric”
“3”     “John”
“9”     “Ronald”

As you can see, the result is completely different: compared as strings, “15” sorts before “3” because the character “1” comes before “3”. Every type that can be used in the CompareWith attribute has its own behavior.
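The difference between the two orderings can be reproduced without a running Cassandra node. The sketch below is only a simulation: it uses a java.util.TreeMap with two comparators standing in for LongType and UTF8Type, and the class name and comparator constants are made up for illustration. For these ASCII names, Java's string ordering matches a byte-wise UTF-8 comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompareWithDemo {

    // Numeric ordering, standing in for LongType.
    static final Comparator<String> AS_LONG =
            (a, b) -> Long.compare(Long.parseLong(a), Long.parseLong(b));

    // Character-by-character ordering, standing in for UTF8Type.
    static final Comparator<String> AS_TEXT = Comparator.naturalOrder();

    // Insert the sample columns in unsorted order and return the names as stored.
    static List<String> storedOrder(Comparator<String> compareWith) {
        Map<String, String> columns = new TreeMap<String, String>(compareWith);
        columns.put("9", "Ronald");
        columns.put("3", "John");
        columns.put("15", "Eric");
        return new ArrayList<String>(columns.keySet());
    }

    public static void main(String[] args) {
        System.out.println(storedOrder(AS_LONG)); // [3, 9, 15]
        System.out.println(storedOrder(AS_TEXT)); // [15, 3, 9]
    }
}
```

As with the ColumnFamily, the ordering is fixed when the map is created, not when it is read.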

The sorting rules apply not only to Columns but also to SuperColumns. For SuperColumns we also need to specify a second sorting rule, using the CompareSubcolumnsWith attribute.

Suppose we have the following structure: three SuperColumns, each containing three Columns, unordered:

SuperColumns

Key: “Learning Cassandra Part 2”
Value (Columns):
    Name            Value
    “Author”        “Ronald Mathies”
    “Visibility”    “Public”
    “Status”        “Draft”

Key: “Learning Cassandra Part 1”
Value (Columns):
    Name            Value
    “Status”        “Draft”
    “Author”        “John Edding”
    “Visibility”    “Public”

Key: “Learning Cassandra Part 3”
Value (Columns):
    Name            Value
    “Status”        “Draft”
    “Author”        “Jason Bourne”
    “Visibility”    “Public”

Now when we set both CompareWith and CompareSubcolumnsWith to UTF8Type on the SuperColumnFamily, we get the following result:

<ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type" Name="Posts"/>

SuperColumns

Key: “Learning Cassandra Part 1”
Value (Columns):
    Name            Value
    “Author”        “John Edding”
    “Status”        “Draft”
    “Visibility”    “Public”

Key: “Learning Cassandra Part 2”
Value (Columns):
    Name            Value
    “Author”        “Ronald Mathies”
    “Status”        “Draft”
    “Visibility”    “Public”

Key: “Learning Cassandra Part 3”
Value (Columns):
    Name            Value
    “Author”        “Jason Bourne”
    “Status”        “Draft”
    “Visibility”    “Public”

In this example I used UTF8Type both for the SuperColumns and for the Columns within them. This doesn’t have to be the case; you can mix any of the sorting types across the two levels. However, it is not possible to have different sorting types on the same level, so you cannot use UTF8Type for some SuperColumns and LongType for others in the same SuperColumnFamily. The same rule applies to Columns.
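The two-level ordering can be pictured as a sorted map of sorted maps, with one comparator per level. The sketch below is a simulation with plain Java collections, not Cassandra code. The class name and its methods are hypothetical, and Comparator.naturalOrder() stands in for UTF8Type on both levels.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class SuperColumnSortDemo {

    private final Comparator<String> compareSubcolumnsWith;
    private final SortedMap<String, SortedMap<String, String>> superColumns;

    // One comparator for the SuperColumn level, one for the Columns inside them.
    SuperColumnSortDemo(Comparator<String> compareWith,
                        Comparator<String> compareSubcolumnsWith) {
        this.compareSubcolumnsWith = compareSubcolumnsWith;
        this.superColumns = new TreeMap<String, SortedMap<String, String>>(compareWith);
    }

    void insert(String superColumn, String column, String value) {
        SortedMap<String, String> columns = superColumns.get(superColumn);
        if (columns == null) {
            // Every SuperColumn gets the same sub-comparator: one rule per level.
            columns = new TreeMap<String, String>(compareSubcolumnsWith);
            superColumns.put(superColumn, columns);
        }
        columns.put(column, value);
    }

    List<String> superColumnNames() {
        return new ArrayList<String>(superColumns.keySet());
    }

    List<String> columnNames(String superColumn) {
        return new ArrayList<String>(superColumns.get(superColumn).keySet());
    }

    // Inserts the posts from the example above in their unordered form.
    static SuperColumnSortDemo samplePosts() {
        Comparator<String> utf8 = Comparator.naturalOrder();
        SuperColumnSortDemo posts = new SuperColumnSortDemo(utf8, utf8);
        posts.insert("Learning Cassandra Part 2", "Author", "Ronald Mathies");
        posts.insert("Learning Cassandra Part 2", "Visibility", "Public");
        posts.insert("Learning Cassandra Part 2", "Status", "Draft");
        posts.insert("Learning Cassandra Part 1", "Status", "Draft");
        posts.insert("Learning Cassandra Part 1", "Author", "John Edding");
        posts.insert("Learning Cassandra Part 1", "Visibility", "Public");
        posts.insert("Learning Cassandra Part 3", "Status", "Draft");
        posts.insert("Learning Cassandra Part 3", "Author", "Jason Bourne");
        posts.insert("Learning Cassandra Part 3", "Visibility", "Public");
        return posts;
    }

    public static void main(String[] args) {
        SuperColumnSortDemo posts = samplePosts();
        for (String name : posts.superColumnNames()) {
            System.out.println(name + " -> " + posts.columnNames(name));
        }
    }
}
```

Passing a different comparator as the second constructor argument corresponds to choosing a different CompareSubcolumnsWith type, while all SuperColumns still share that one choice.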

Besides the standard sorting types, it is also possible to add your own custom sorting types. To create one, you write a class that extends org.apache.cassandra.db.marshal.AbstractType. To use it, package your class in a Java Archive and add it to the /lib folder of your Cassandra installation. Then, in the database configuration file, specify the fully qualified class name in the CompareWith or CompareSubcolumnsWith attribute. This makes the sorting capabilities even more powerful. In a later post I will show an example of creating a custom sorting type.

I am already working hard on the next blog post, which will cover the basics of using Thrift (Apache Cassandra’s client API).
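As a rough idea of what the comparison logic of a custom type could look like, here is a standalone sketch that orders 8-byte long names in descending order. It deliberately implements only java.util.Comparator so that it runs without Cassandra on the classpath. In a real custom type this logic would live in a class extending AbstractType, and the methods you have to override depend on your Cassandra version, so check the AbstractType source of your release. The class name and helper methods here are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

final class DescendingLongComparator implements Comparator<byte[]> {

    // Assumes both names are 8-byte big-endian longs.
    @Override
    public int compare(byte[] left, byte[] right) {
        long l = ByteBuffer.wrap(left).getLong();
        long r = ByteBuffer.wrap(right).getLong();
        return Long.compare(r, l); // swapped arguments give descending order
    }

    // Helper for the demo: encode a long as an 8-byte name.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    // Store the names in a sorted map and return them in stored order.
    static List<Long> sortedNames(long... names) {
        TreeMap<byte[], Boolean> columns =
                new TreeMap<byte[], Boolean>(new DescendingLongComparator());
        for (long name : names) {
            columns.put(encode(name), Boolean.TRUE);
        }
        List<Long> result = new ArrayList<Long>();
        for (byte[] key : columns.keySet()) {
            result.add(ByteBuffer.wrap(key).getLong());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(9L, 3L, 15L)); // [15, 9, 3]
    }
}
```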

  1. ibillguo
    March 28th, 2010 at 15:09 | #1

    It’s great, thanks a lot for sharing
    And waiting for your how to use thrift blogs

  2. Sekhar
    August 26th, 2010 at 11:57 | #2

    Hi

    I am a new Bie of Cassandra DB, say suppose i want to load all the records from a table, in the cassandra terminilogy how to load all the key values from a columnfamily, if you dont know the keys values.

    and one more thing any sorting kind of thing we can’t done at db level right, all has take care in the application only, so more burden in the application, any alternatives for this …..

    Thanks in Advance…. :)

  3. August 26th, 2010 at 12:13 | #3

    Retrieving all data from a ColumnFamily can be done by using the KeyRange class. Normally you would specify a start key and an end key, but they are not required. However, you do need to specify a count for how many rows you want to retrieve.

    You need to keep one thing in mind: until Cassandra 0.7 (and possibly the first 0.7 betas) you cannot retrieve more data at a time than fits in memory.

    About the sorting: Cassandra stores its data in sorted order, and besides the default sorting types you can also create a custom sorting class. More information about this can be found here:

    Creating custom sorting types for Apache Cassandra

    Hope this answers your questions…
