
Installing and using Apache Cassandra With Java Part 4 (Thrift Client)


Now that we have covered some of the theory, we can put our knowledge of the data model and sorting behavior to use. In this blog post I will explain how to configure your ColumnFamilies, how to connect to the database, and how to store, retrieve, modify and delete data.

Cassandra uses the Apache Thrift framework as its client API. Thrift is a separate Apache project which is, to put it simply, a binary communication protocol. It is used not only by Cassandra but by many other projects; for an overview of the companies and projects that use Thrift, take a look at their wiki. Thrift is also available for a large number of programming languages, ranging from Java and C++ to Python, PHP and many others.

Thrift lacks some features that are quite important if you are going to use Cassandra. As we know, Cassandra supports multiple nodes, and it basically doesn’t matter which node you connect to. The big advantage is that when a node fails you can simply connect to a different node and continue your work. However, Thrift doesn’t support this out of the box. Another important feature is connection pooling; Thrift doesn’t do this for you, so you need to take care of it yourself. This isn’t because the authors of Thrift still need to add these features; it is simply not the purpose of Thrift. Thrift handles the communication between software components. Not every product that uses Thrift needs these capabilities, and adding them would only create overhead.
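To make the failover idea concrete, here is a minimal sketch in plain Java. It is only an illustration: the Connector interface, the FailoverSketch class and the host names are made up for this example and are not part of Thrift or Cassandra, and a real client would also need pooling and retry policies.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: try each known node in turn until one accepts the connection.
class FailoverSketch {

    // Stand-in for "open a transport to this host"; this is not a Thrift type.
    interface Connector<T> {
        T connect(String host) throws Exception;
    }

    static <T> T connectToAny(List<String> hosts, Connector<T> connector) {
        Exception last = null;
        for (String host : hosts) {
            try {
                return connector.connect(host);
            } catch (Exception e) {
                last = e; // remember the failure and try the next node
            }
        }
        throw new IllegalStateException("No node reachable", last);
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("node1", "node2", "node3");
        // Fake connector: node1 is "down", every other node answers.
        String connected = connectToAny(hosts, host -> {
            if (host.equals("node1")) {
                throw new java.io.IOException("node1 is down");
            }
            return "connected to " + host;
        });
        System.out.println(connected);
    }
}
```

With the Thrift client, the connector would be the place where a TSocket to the given host is created and opened.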

However, there are a few Cassandra clients around that add these capabilities along with some other nice features to make your life easier. Personally, I think Hector is one of the best at the moment; it adds the features I mentioned and a lot more. It is an open source project under the MIT license, hosted on GitHub.

For now we will focus on using only the Thrift client. Learning the basics makes it easier to determine which client best supports your needs, and if something cannot be done with a client you can always fall back on the Thrift API directly.

Before we can start with the client, we first need to configure the Cassandra database so that it has the ColumnFamilies we are going to use. In the CASSANDRA_HOME/conf folder is a configuration file named storage-conf.xml. Open this file in an editor and find the Keyspaces element. By default there are already two keyspaces. One is the system keyspace, which is not defined in the XML file but is present within Cassandra and is needed for its internals. The other is Keyspace1, which is not used by Cassandra itself and only serves demo purposes.

Inside the existing Keyspaces element, place a new Keyspace element:

<Keyspaces>
<Keyspace Name="Blog"> <!-- Add this line -->
</Keyspace> <!-- Add this line -->
</Keyspaces>

Now that we have defined our keyspace we can start to add some ColumnFamilies:

<Keyspaces>
  <Keyspace Name="Blog">
    <ColumnFamily CompareWith="UTF8Type" Name="Authors"/>
    <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Posts"/>
    <ColumnFamily CompareWith="UTF8Type" Name="Tags"/>

    <!-- Necessary for Cassandra -->
    <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
    <ReplicationFactor>1</ReplicationFactor>
    <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
  </Keyspace>
</Keyspaces>

So now we have three ColumnFamilies, one to store the information about the Authors, one to store the Posts that the Authors have created, and finally one that will contain the information about the Tags. The reason for the Tags ColumnFamily will get clearer in the following part, for now we only need to know that the Tags ColumnFamily will keep the associations between the Tags and the Posts. A post can have a number of tags and a tag can have a number of posts.

The last three elements (ReplicaPlacementStrategy, ReplicationFactor and EndPointSnitch) are necessary for Cassandra to operate. They are set to their default values and we will keep them that way.
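A quick way to check the edited file is to parse it and list what it declares. The sketch below uses only the JDK's DOM parser on a trimmed-down stand-in for the configuration; the class and method names are made up for this example. It counts the Keyspaces container elements (there should be one) and lists the names of the Keyspace elements inside it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Cassandra.
class KeyspaceConfigCheck {

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML", e);
        }
    }

    // Number of <Keyspaces> container elements found in the document.
    static int keyspacesElementCount(String xml) {
        return parse(xml).getElementsByTagName("Keyspaces").getLength();
    }

    // The Name attribute of every <Keyspace> element, in document order.
    static List<String> keyspaceNames(String xml) {
        NodeList nodes = parse(xml).getElementsByTagName("Keyspace");
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(((Element) nodes.item(i)).getAttribute("Name"));
        }
        return names;
    }

    public static void main(String[] args) {
        // Trimmed-down stand-in for the relevant part of storage-conf.xml.
        String xml = "<Keyspaces>"
                + "<Keyspace Name=\"Keyspace1\"/>"
                + "<Keyspace Name=\"Blog\"/>"
                + "</Keyspaces>";
        System.out.println(keyspacesElementCount(xml) + " Keyspaces element(s): " + keyspaceNames(xml));
    }
}
```

To check a real file, the same parsing could be pointed at CASSANDRA_HOME/conf/storage-conf.xml instead of an inline string.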

The Authors ColumnFamily will organize the data using Columns, the following is a representation of the ColumnFamily in a sort of hierarchical way:

ColumnFamily: Authors

Key: "Eric Long"
  Columns:
    Name               Value
    "email"            "eric (at) long.com"
    "country"          "United Kingdom"
    "registeredSince"  "01/01/2002"

Key: "John Steward"
  Columns:
    Name               Value
    "email"            "john.steward (at) somedomain.com"
    "country"          "Australia"
    "registeredSince"  "01/01/2009"

Key: "Ronald Mathies"
  Columns:
    Name               Value
    "email"            "ronald (at) sodeso.nl"
    "country"          "Netherlands, The"
    "registeredSince"  "01/01/2010"

As you can see it is quite simple: the key is set to the name of the author and the columns hold the information about the author. The sorting rule we applied to the ColumnFamily is UTF8Type, which means that the columns within each row are sorted alphabetically by column name.

The Posts SuperColumnFamily will organize the data using SuperColumns which contain Columns:

SuperColumnFamily: Posts

Key: "cats-are-funny-animals"
  SuperColumn: "post"
    Name      Value
    "title"   "Cats are funny animals"
    "body"    "Bla bla bla… long story…"
    "author"  "Ronald Mathies"
    "created" "01/02/2010"
  SuperColumn: "tags"
    Name  Value
    "0"   "cats"
    "1"   "animals"

Key: "dogs-are-great-companions"
  SuperColumn: "post"
    Name      Value
    "title"   "Dogs are great companions"
    "body"    "Bla bla bla… long story…"
    "author"  "John Steward"
    "created" "01/05/2009"
  SuperColumn: "tags"
    Name  Value
    "0"   "dogs"
    "1"   "animals"

Key: "i-am-allergic-to-animals"
  SuperColumn: "post"
    Name      Value
    "title"   "I am allergic to animals"
    "body"    "Bla bla bla… long story…"
    "author"  "Eric Long"
    "created" "01/01/2003"
  SuperColumn: "tags"
    Name  Value
    "0"   "allergy"
    "1"   "animals"

Every row in the family consists of two SuperColumns, one for the post itself and one for its list of tags. The row key is created like this on purpose for simple access: if we want to link to a post we can just append the key to the URL, like:

http://myblog.com/i-am-allergic-to-animals
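How such a row key is derived from the title is not prescribed by Cassandra. One possible approach, shown purely as an assumption for this example, is to lower-case the title and replace every run of other characters with a hyphen. The PostKeys class and toRowKey method are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper for turning a post title into a URL-friendly row key.
class PostKeys {

    static String toRowKey(String title) {
        return title.toLowerCase(Locale.ENGLISH)
                .replaceAll("[^a-z0-9]+", "-")  // collapse anything that is not a letter or digit
                .replaceAll("^-+|-+$", "");     // trim hyphens at the start and the end
    }

    public static void main(String[] args) {
        System.out.println("http://myblog.com/" + toRowKey("I am allergic to animals"));
    }
}
```

Two posts with the same title would end up with the same key, so a real blog would need some extra rule to keep keys unique.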

Finally we have the Tags ColumnFamily for associating the Posts with the Tags:

ColumnFamily: Tags

Key: "cats"
  Columns:
    Name                         Value
    "Cats are funny animals"     "cats-are-funny-animals"

Key: "dogs"
  Columns:
    Name                         Value
    "Dogs are great companions"  "dogs-are-great-companions"

Key: "allergy"
  Columns:
    Name                         Value
    "I am allergic to animals"   "i-am-allergic-to-animals"

Key: "animals"
  Columns:
    Name                         Value
    "Cats are funny animals"     "cats-are-funny-animals"
    "Dogs are great companions"  "dogs-are-great-companions"
    "I am allergic to animals"   "i-am-allergic-to-animals"

As you can see, every tag that is mentioned in a post is used here as a key, and for every key we keep track of which posts have that tag. If we now display a post on the screen and show a listing of all the tags associated with it, the user can click on a tag and see all the posts that share that tag. This is a sort of index that would otherwise not be possible, since we cannot search through the posts in the Posts ColumnFamily by tag. Now that we have a clearer understanding of how the data is arranged in the database, we can start working on our client.
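The relation between the Posts and Tags layouts can be modelled in memory with nested maps. The sketch below is not Cassandra code: the TagIndexSketch class is made up for this example, and the outer TreeMap is only used to keep printed output stable. It derives the tag index from a set of posts, which is what the application has to do itself whenever it stores a post.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory model of the Tags layout: tag -> (post title -> post row key).
class TagIndexSketch {

    static SortedMap<String, SortedMap<String, String>> buildTagIndex(
            Map<String, String> titleByPostKey, Map<String, List<String>> tagsByPostKey) {
        SortedMap<String, SortedMap<String, String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> post : tagsByPostKey.entrySet()) {
            String postKey = post.getKey();
            String title = titleByPostKey.get(postKey);
            for (String tag : post.getValue()) {
                // One "row" per tag; the columns are sorted by name (the title).
                index.computeIfAbsent(tag, t -> new TreeMap<>()).put(title, postKey);
            }
        }
        return index;
    }

    // The three example posts used in this article.
    static SortedMap<String, SortedMap<String, String>> sampleIndex() {
        Map<String, String> titles = new LinkedHashMap<>();
        titles.put("cats-are-funny-animals", "Cats are funny animals");
        titles.put("dogs-are-great-companions", "Dogs are great companions");
        titles.put("i-am-allergic-to-animals", "I am allergic to animals");

        Map<String, List<String>> tags = new LinkedHashMap<>();
        tags.put("cats-are-funny-animals", Arrays.asList("cats", "animals"));
        tags.put("dogs-are-great-companions", Arrays.asList("dogs", "animals"));
        tags.put("i-am-allergic-to-animals", Arrays.asList("allergy", "animals"));

        return buildTagIndex(titles, tags);
    }

    public static void main(String[] args) {
        // All posts that carry the tag "animals".
        System.out.println(sampleIndex().get("animals"));
    }
}
```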

This posting also includes sample code with all the source code in an Eclipse project; the project also contains some extra examples that you can have a look at. When you have the project loaded into Eclipse you will probably see some error messages. These have to do with a variable in your project libraries. Change the target folder of the variable in the following way:

Right click on the project and choose Properties.
Click on the Java Build Path in the left part and then click on the Libraries tab. Now click on one of the lines that mention CASSANDRA_HOME and choose the Edit… button. Now you can click on the Variable… button to create a new variable with the name CASSANDRA_HOME, make sure it points to your Cassandra home folder ( for example C:/dev/apache-cassandra-0.6.0-beta2 ).

A client needs a number of libraries on its classpath to be able to communicate with Cassandra. All these libraries already exist within the CASSANDRA_HOME/lib folder of your Cassandra installation:

  • apache-cassandra-x.x.x-betax.jar (The Cassandra libraries which also contains the customization of the Thrift communications protocol)
  • slf4j-log4jxx-x.x.xx.jar (Plugin for SLF4J to add support for Log4j)
  • slf4j-api-x.x.xx.jar (Wrapper for various logging frameworks)
  • log4j-x.x.xx.jar (Log4J logging framework)
  • libthrift-xxxx.jar (The Thrift communication protocol)

In Installing and using Apache Cassandra With Java Part 1 you can find the download locations and the versions of the libraries I am using.

First of all we need to be able to make a connection to the database. To do that we open a connection on port 9160, which is the default port for Cassandra, and hand it over to the Cassandra client. The client will take care of the communication with the server:

import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

import org.apache.cassandra.thrift.Cassandra;

...

TTransport transport = new TSocket("localhost", 9160);
TProtocol protocol = new TBinaryProtocol(transport);
Cassandra.Client client = new Cassandra.Client(protocol);
transport.open();

Now that we know how to open a connection, we also need to close it in a decent way. The flush that is done before the close takes care of any data still residing in the transport buffer:

transport.flush();
transport.close();

First we will have a look at how to store a new Author into the database:

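import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ConsistencyLevel;

...

// Every column needs a timestamp; create it once for all columns.
long timestamp = System.currentTimeMillis();

Map<String, List<ColumnOrSuperColumn>> data = new HashMap<String, List<ColumnOrSuperColumn>>();

List<ColumnOrSuperColumn> columns = new ArrayList<ColumnOrSuperColumn>();

// Create the email column.
ColumnOrSuperColumn email = new ColumnOrSuperColumn();
email.setColumn(new Column("email".getBytes("utf-8"), "ronald (at) sodeso.nl".getBytes("utf-8"), timestamp));
columns.add(email);

// Create the country column.
ColumnOrSuperColumn country = new ColumnOrSuperColumn();
country.setColumn(new Column("country".getBytes("utf-8"), "Netherlands, The".getBytes("utf-8"), timestamp));
columns.add(country);

// Create the registeredSince column.
ColumnOrSuperColumn registeredSince = new ColumnOrSuperColumn();
registeredSince.setColumn(new Column("registeredSince".getBytes("utf-8"), "01/01/2010".getBytes("utf-8"), timestamp));
columns.add(registeredSince);

// The map key is the ColumnFamily name, the row key is passed to batch_insert.
data.put("Authors", columns);

client.batch_insert("Blog", "Ronald Mathies", data, ConsistencyLevel.ANY);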

The first thing you will probably notice is the amount of code needed to get this done. I will provide some solutions to this in a later posting, because for now we still want to focus on using Thrift. So what is happening here? First we create the timestamp. The timestamp is used by the columns and is always needed. Never put just 0 in there: this will get you into trouble when you are using multiple nodes, which rely on the timestamp to decide which version of the data is the most recent.
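Why a timestamp of 0 causes trouble can be shown with a small model. The sketch below is a simplification and not Cassandra's actual code: the Versioned and TimestampSketch classes are made up for this example, and ties between equal timestamps are not modelled faithfully. The idea is that when two versions of the same column meet, the one with the higher timestamp is kept.

```java
// Simplified model of "the highest timestamp wins".
class TimestampSketch {

    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Keeps whichever version carries the higher timestamp.
    static Versioned reconcile(Versioned current, Versioned incoming) {
        return incoming.timestamp > current.timestamp ? incoming : current;
    }

    public static void main(String[] args) {
        Versioned stored = new Versioned("ronald (at) sodeso.nl", 1269860000000L);
        // The update was written with the timestamp wrongly left at 0.
        Versioned update = new Versioned("ronald@mathies.nl", 0L);
        // The update loses against the stored value because its timestamp is older.
        System.out.println(reconcile(stored, update).value);
    }
}
```

Using System.currentTimeMillis() for every write, as the examples in this post do, gives each new write a timestamp at least as high as the writes that preceded it on the same machine.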

Then we create a Map which will hold the data that we want to insert. The key of the Map is a String, which is the name of the ColumnFamily that we defined in the configuration file; the value is a List of ColumnOrSuperColumn objects.

The ColumnOrSuperColumn is an aggregate supplied by Cassandra which contains either a Column or a SuperColumn. It is not possible to set both of them (the code would compile without problems, but you will receive exceptions when you execute it).

Then we create the List instance which will contain the Columns that we want to store, in this case the columns for the email, country and registeredSince. After this is done we add the list of columns to the map.

Finally we use client.batch_insert to store everything at once. It is also possible to store a single Column or SuperColumn, in which case you would use the client.insert method, but in our case we want to store the complete Author at once. As you may have noticed, we provide the key of the Author in the batch_insert method and the ColumnFamily in the outermost Map. This allows us to create rows in multiple ColumnFamilies at once which share the same row key.

All the methods provided by the Cassandra client have a ConsistencyLevel parameter. This parameter is used for both read and write actions and determines when the request made by the client is considered successful; the Cassandra Wiki has an excellent explanation of the various options. For now it is sufficient to know that ANY means a write is successful when it has been written to at least one node. For reads we will use ONE, which means that the first node to respond supplies the data (more is done behind the scenes, but for the moment that is not important).
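As a rough way to think about the levels for writes, the sketch below maps a level to the number of acknowledgements needed before the write counts as successful. The enum is local to this example and only mirrors the names of Cassandra's levels; QUORUM is included for completeness although we do not use it in this post. It is a model for reasoning, not the server's implementation.

```java
// Simplified model: how many acknowledgements a write needs per consistency level.
class ConsistencySketch {

    // Local enum for this example; not org.apache.cassandra.thrift.ConsistencyLevel.
    enum Level { ANY, ONE, QUORUM, ALL }

    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ANY: // one node is enough, and a stored hint also counts
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // a majority of the replicas
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("Unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " with replication factor 3 -> " + requiredAcks(level, 3));
        }
    }
}
```

With the ReplicationFactor of 1 from our configuration, every level comes down to a single node.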

Suppose we have created a number of authors; it would be nice if we could retrieve them again. So let’s retrieve a single author:

import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;

...

SlicePredicate slicePredicate = new SlicePredicate();
SliceRange sliceRange = new SliceRange();
sliceRange.setStart(new byte[] {});
sliceRange.setFinish(new byte[] {});
slicePredicate.setSlice_range(sliceRange);

List<ColumnOrSuperColumn> result =
  client.get_slice("Blog", "Ronald Mathies", new ColumnParent("Authors"), slicePredicate, ConsistencyLevel.ONE);

First we create a SlicePredicate; the predicate tells Cassandra what data you want to fetch. Here we say that we want to get a range of columns. Remember the sorting rules? They are important in this case: the start and finish refer to the name attribute of the column, so Cassandra uses the sorting mechanism to decide whether a column falls within the range. Here we just pass in new byte[] {}, which is an empty byte array. For Cassandra this means that we want to fetch every column that belongs to the Author.
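The way a slice range selects columns can be imitated with a sorted map. The sketch below is an in-memory model only: the SliceSketch class is made up for this example, an empty string plays the role of the empty byte array, both bounds are treated as inclusive, and Java's String ordering stands in for UTF8Type (for plain ASCII names the two agree).

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory model of a slice over the sorted columns of one row.
class SliceSketch {

    static SortedMap<String, String> slice(
            NavigableMap<String, String> columns, String start, String finish, int count) {
        NavigableMap<String, String> view = columns;
        if (!start.isEmpty()) {
            view = view.tailMap(start, true);   // empty start means "from the first column"
        }
        if (!finish.isEmpty()) {
            view = view.headMap(finish, true);  // empty finish means "up to the last column"
        }
        SortedMap<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> column : view.entrySet()) {
            if (result.size() >= count) {
                break;                          // never return more than count columns
            }
            result.put(column.getKey(), column.getValue());
        }
        return result;
    }

    // The columns of the "Ronald Mathies" row from the Authors example.
    static NavigableMap<String, String> sampleRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("email", "ronald (at) sodeso.nl");
        row.put("country", "Netherlands, The");
        row.put("registeredSince", "01/01/2010");
        return row;
    }

    public static void main(String[] args) {
        System.out.println(slice(sampleRow(), "", "", 100).keySet());   // every column
        System.out.println(slice(sampleRow(), "d", "f", 100).keySet()); // names between "d" and "f"
    }
}
```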

The result will contain a list of ColumnOrSuperColumn objects that in turn contain the Column objects holding our data. If we want to fetch all authors, or a number of them, we would use the following:

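import java.util.List;

import org.apache.cassandra.thrift.KeyRange;
import org.apache.cassandra.thrift.KeySlice;

...

// Return at most three rows; empty start and end keys mean "no bounds".
KeyRange keyRange = new KeyRange(3);
keyRange.setStart_key("");
keyRange.setEnd_key("");

// Empty start and finish: fetch every column of each row.
SliceRange sliceRange = new SliceRange();
sliceRange.setStart(new byte[] {});
sliceRange.setFinish(new byte[] {});

SlicePredicate slicePredicate = new SlicePredicate();
slicePredicate.setSlice_range(sliceRange);

List<KeySlice> keySlices = client.get_range_slices("Blog",
  new ColumnParent("Authors"), slicePredicate, keyRange, ConsistencyLevel.ONE);

for (KeySlice keySlice : keySlices) {
  String key = keySlice.getKey();                             // the row key
  List<ColumnOrSuperColumn> columns = keySlice.getColumns();  // the columns of that row
}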

It is not much different than before: the method has changed to get_range_slices, and in addition to the SlicePredicate we also pass in a KeyRange. The KeyRange is used to specify which keys fall into your search criteria. In our case we don’t care what the keys themselves are; what we do care about is the number of rows, since we only want the first three.
The list of KeySlice objects that is fetched contains, for each row, the row key and the columns belonging to that row.

So far we have only used the SlicePredicate to retrieve all columns, now here is an example where we only retrieve the columns we specify:

List<byte[]> columns = new ArrayList<byte[]>();
columns.add("email".getBytes());
columns.add("registeredSince".getBytes());
       
SlicePredicate slicePredicate = new SlicePredicate();
slicePredicate.setColumn_names(columns);

Now that we have seen a few of the search methods, we would also like to change the data. For this we will use the batch_mutate method. Suppose that we want to change the e-mail address:

long timestamp = System.currentTimeMillis();
Column column =
  new Column("email".getBytes("utf-8"), "ronald@mathies.nl".getBytes("utf-8"), timestamp);

ColumnOrSuperColumn columnOrSuperColumn = new ColumnOrSuperColumn();
columnOrSuperColumn.setColumn(column);

Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(columnOrSuperColumn);
   
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);
   
Map<String, List<Mutation>> job = new HashMap<String, List<Mutation>>();
job.put("Authors", mutations);

Map<String, Map<String, List<Mutation>>> batch =
  new HashMap<String, Map<String, List<Mutation>>>();
batch.put("Ronald Mathies", job);

client.batch_mutate("Blog", batch, ConsistencyLevel.ALL);

First we create the Column that contains the actual change that we want to make; we supply it with the column name and the new e-mail address. Then we need to add the Column to the ColumnOrSuperColumn aggregate and the aggregate to a Mutation object. The Mutation object can also be used to create new Columns or to delete a Column: if you supply a column name that doesn’t exist in the row it will be created, and if it already exists its value and timestamp will be overwritten. You can also put a Deletion object in a Mutation to tell Cassandra that you want to remove a Column; we will see an example of that later on. We then add the Mutation to a list and put that list into a Map under the name of the ColumnFamily. Finally we put that Map into the outermost Map under the key of the row.

We have now covered inserting, selecting and mutating data, which only leaves deleting. Suppose we want to remove an Author completely; this is quite simple:

long timestamp = System.currentTimeMillis();
client.remove("Blog", "Ronald Mathies", new ColumnPath("Authors"), timestamp, ConsistencyLevel.ALL);

We use the remove method and supply it with the keyspace, the key of the author and the Authors ColumnFamily. That is all. If we want to remove a single column from an Author, we have to do a bit more:

import org.apache.cassandra.thrift.Deletion;

...

long timestamp = System.currentTimeMillis();

List<byte[]> columns = new ArrayList<byte[]>();
columns.add("email".getBytes());

SlicePredicate slicePredicate = new SlicePredicate();
slicePredicate.setColumn_names(columns);

Deletion deletion = new Deletion(timestamp);
deletion.setPredicate(slicePredicate);

Mutation mutation = new Mutation();
mutation.setDeletion(deletion);

List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

Map<String, List<Mutation>> job = new HashMap<String, List<Mutation>>();
job.put("Authors", mutations);

Map<String, Map<String, List<Mutation>>> batch =
  new HashMap<String, Map<String, List<Mutation>>>();
batch.put("Ronald Mathies", job);

client.batch_mutate("Blog", batch, ConsistencyLevel.ALL);

It is basically the same as what we were doing when we were modifying data, except that we don’t create any Columns but we create the Deletion object which contains a predicate to define what we want to delete. In this sample we use the SlicePredicate to store a list of column names that we want to remove.

It is allowed to combine the actions: a batch can contain the removal of two columns, the creation of one column and the modification of four existing columns, each as a separate Mutation object. The sample project attached to this post contains a method that does this.
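When several mutations are combined, most of the code goes into building the nested Map of row key, ColumnFamily and list of mutations. A small helper can take that work away. The BatchBuilder class below is a hypothetical sketch, not something that ships with Cassandra or Thrift; it is generic in the mutation type so that it compiles without any Cassandra libraries, and the demo uses plain Strings as stand-ins for Mutation objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that builds the row key -> ColumnFamily -> mutations structure.
class BatchBuilder<M> {

    private final Map<String, Map<String, List<M>>> batch = new HashMap<>();

    // Adds one mutation for the given row and ColumnFamily, creating the nested maps on demand.
    BatchBuilder<M> add(String rowKey, String columnFamily, M mutation) {
        batch.computeIfAbsent(rowKey, k -> new HashMap<>())
             .computeIfAbsent(columnFamily, k -> new ArrayList<>())
             .add(mutation);
        return this;
    }

    Map<String, Map<String, List<M>>> build() {
        return batch;
    }

    public static void main(String[] args) {
        // Strings stand in for Mutation objects in this demo.
        Map<String, Map<String, List<String>>> batch = new BatchBuilder<String>()
                .add("Ronald Mathies", "Authors", "update email")
                .add("Ronald Mathies", "Authors", "delete country")
                .build();
        System.out.println(batch);
    }
}
```

Used as a BatchBuilder<Mutation>, the result of build() has the same shape as the batch Map that is passed to client.batch_mutate in the examples above.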

We have now covered all the basics for a standard ColumnFamily with normal Column objects. For a SuperColumnFamily it looks quite the same except that you have an extra level within the data structure, we will cover that the next time and we will talk about some other things.

I hope you have enjoyed this posting. If you have any remarks or suggestions, please let me know; if you want to know when the next posting is live, just subscribe to my Twitter account.


  1. porchamt
    March 29th, 2010 at 20:55 | #1

    Thanks for this nice tutorial.

    I generated the java source out of cassandra.thrift and used it instead of putting apache-cassandra-x.x.x-betax.jar on the Build Path.

    Any plans to move this content to the official cassandra wiki?

    Cheers,

    Thomas

  2. ibillguo
    March 30th, 2010 at 02:37 | #2

    One question, how to sort by column?
    Such as I have a Word Freqency list, and want to sort by frequency

  3. March 30th, 2010 at 06:50 | #3

    @ibillguo
    I think it is best if you also read the third (http://www.sodeso.nl/?p=207) part which talks about sorting, but in short, Cassandra doesn’t allow sorting of data when you query it. On a ColumnFamily you can specify sorting for the keys using various sorting types or a custom sorting class. All the data you store within that ColumnFamily gets sorted implicitly and is stored in a sorting manner. In your example you would have to use an inverted index, which basically means that you create an extra ColumnFamily where the key would be the frequency and the column would be the key for retrieving data from the ColumnFamily which contains the information you want to display. If the frequency is a number you could just use the LongType for sorting.

  4. March 30th, 2010 at 06:54 | #4

    @porchamt
    Yes there are plans to move a part of the blog postings to the official Cassandra Wiki, i was contacted by Jonathan Ellis (Project Chair at Cassandra) who asked if i could do a rewrite of the DataModel Wiki page which basically explains what i explain in part 2 & 3. It will be in a similar style using the diagrams and examples.

    Personally i am very pleased with this request as it acknowledges that i am doing a good job on the postings.

  5. ibillguo
    March 30th, 2010 at 09:42 | #5

    @Ronald Mathies
    The question is this:
    There will have a lot words with the same frequency…

  6. ibillguo
    March 30th, 2010 at 09:47 | #6

    One solution I have it’s put them in a supercolumn
    and seperate them with a unique id

  7. ibillguo
    March 30th, 2010 at 09:54 | #7

    Sorry, should be put the name as the subcolumn’s key, and put any number as the value

  8. March 30th, 2010 at 10:22 | #8

    @ibillguo
    The name would be the word? and the number would be the frequency (and by frequency i think you mean the number of times that word occured?)

    Because only the Columns name and the SuperColumns key are being sorted. What if you would store it like this:

    |12|word4|
    |12|word5|
    |6|word2|
    |4|word3|
    |1|word1|

    Now when you retrieve the data you can say that you want the first three lines which would result in:

    |12|word4|
    |12|word5|
    |6|word2|

  9. ibillguo
    March 30th, 2010 at 14:12 | #9

    Yes, I am storing it something like this
    Thanks a lot for your help

  10. John
    April 1st, 2010 at 18:17 | #10

    “I think it is best if you also read the third (http://www.sodeso.nl/?p=207) part which talks about sorting, but in short, Cassandra doesn’t allow sorting of data when you query it. On a ColumnFamily you can specify sorting for the keys using various sorting types or a custom sorting class. All the data you store within that ColumnFamily gets sorted implicitly and is stored in a sorting manner. In your example you would have to use an inverted index, which basically means that you create an extra ColumnFamily where the key would be the frequency and the column would be the key for retrieving data from the ColumnFamily which contains the information you want to display. If the frequency is a number you could just use the LongType for sorting.”

    I think you mixed the concepts of “key” and “column name” here. Key is supposed to refer to the value used to identify the whole row, while column name is just a name of the column within each row. The sorting is on the column values within each columnfamily, not the keys.

  11. April 1st, 2010 at 20:51 | #11

    Ah yes, indeed, i re-read my own explanation in the third part and it is correctly described there. The only time when a key gets sorted is when you use the SuperColumns, then you can specify two sorting options, one for the key of the SuperColumn (CompareWith) and one for the name attribute of the columns (CompareSubcolumnsWith)

  12. April 19th, 2010 at 19:09 | #12

    In the update example, you have a Map named “batch” but then don’t do anything with it, batch_mutate is called with the “job” Map. Also, before the example, you call it “batch_update”

  13. April 19th, 2010 at 20:33 | #13

    @Ian Kallen
    Thanks for the update, made the change, also the example after that was corrupted (generics gone wrong or so), also updated that one.

  14. paly
    June 9th, 2010 at 19:29 | #14

    Hi,

    Is there a “count” function, just to retrieve the number of rows in a specific range ?

    Are there also some aggregate function like min / max / avg ?

    Thanks a lot for this great page,

    PALY

  15. June 9th, 2010 at 19:38 | #15

    @ paly
    About the aggregate functions:

    No there are no aggregate functions, the only option you have is to calculate them on the client side and for example store them within the server when you can re-use them.

    About the count function:

    Yes there is, the get_count method (Java Thrift client), you can read more about it on the following page:

    http://www.sodeso.nl/?p=354

  16. mehmet
    June 30th, 2010 at 09:29 | #16

    Thanks for your tutorial but when i applied all of these things i get error messages like :

    Remove all the authors we might have created before.

    InvalidRequestException(why:Keyspace Blog does not exist in this schema.)
    at org.apache.cassandra.thrift.Cassandra$remove_result.read(Cassandra.java:14354)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_remove(Cassandra.java:755)
    at org.apache.cassandra.thrift.Cassandra$Client.remove(Cassandra.java:729)
    at Authors.removeAuthor(Authors.java:141)
    at Authors.main(Authors.java:59)
    InvalidRequestException(why:Keyspace Blog does not exist in this schema.)
    at org.apache.cassandra.thrift.Cassandra$remove_result.read(Cassandra.java:14354)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_remove(Cassandra.java:755)
    at org.apache.cassandra.thrift.Cassandra$Client.remove(Cassandra.java:729)
    at Authors.removeAuthor(Authors.java:141)
    at Authors.main(Authors.java:60)

    Moderator: I have removed some of the stacktraces since they are all similar and the first one already makes it clear

    when i run authors.java

    what is wrong with me ??

    Thank you very much

  17. June 30th, 2010 at 09:59 | #17

    There is nothing wrong with you :)

    However, have you defined the keyspaces in your storage-conf.xml? Because the message you get tells me that they aren’t.

    Can you post your keyspace definition from your storage-conf.xml so I can check that it is correct?

  18. mehmet
    June 30th, 2010 at 10:53 | #18

    ı Dont know why but when i write what i have for keyspace it cant be seen here

  19. mehmet
    June 30th, 2010 at 10:54 | #19

    what i have for keyspace is the same as u wrote in this page.

  20. June 30th, 2010 at 11:00 | #20

    @ mehmet

    Hmm, XML markup doesn’t go well here. I have sent you an e-mail; can you send your storage-conf.xml back to me as an attachment?

  21. mehmet
    June 30th, 2010 at 11:01 | #21

    this is what is written in storage-conf.xml, i hope that will be seen by you .

    org.apache.cassandra.locator.RackUnawareStrategy
    1
    org.apache.cassandra.locator.EndPointSnitch

  22. June 30th, 2010 at 11:07 | #22

    @ mehmet

    I see the problem, you currently have two Keyspaces elements, however, the configuration requires a single Keyspaces element and within that element are multiple Keyspace definitions:

    So it is:

    <Keyspaces>
      <Keyspace Name="Blog">
      </Keyspace>
      <Keyspace Name="Keyspace1">
      </Keyspace>
    </Keyspaces>

  23. mehmet
    June 30th, 2010 at 11:24 | #23

    ı did it :)
    Thank you very very much for help :)
    u saved my day :) ))

  24. August 13th, 2010 at 07:50 | #24

    Received by e-mail:

    Can you explain why the batch insert example is:

    data.put("Authors", columns);
    client.batch_insert("Blog", "Ronald Mathies", data, ConsistencyLevel.ANY);

    instead of:

    data.put("Ronald Mathies", columns);
    client.batch_insert("Blog", "Authors", data, ConsistencyLevel.ANY);

    What if I wanted to insert multiple authors at once (i.e. data would have more than one entry)? Wouldn’t they all end up having the same key?

  25. August 13th, 2010 at 07:59 | #25

    @ Ronald Mathies

    Sadly (well, actually for good reasons), no, you cannot switch them around since the “Authors” is the ColumnFamily name.

    The reason it works like this is that you can store several different structures for the same key in different ColumnFamilies, allowing you to store mutually related data all at once. I can imagine that the naming of the method is a bit confusing since it is called batch_xxx, so yes, in the beginning I also thought that it would be possible to store several keys at the same time.
