Su pregunta como apreciar?
Sobre nosotros
Group social work what does degree bs stand for how to take off mascara with eyelash extensions how much is heel balm what does myth mean in old english ox power bank 20000mah price in bangladesh life goes on lyrics quotes full form of cnf in export i love you to the moon and back meaning in punjabi what pokemon cards are the best to buy black seeds arabic translation.

The use of NoSQL databases for cloud environments has been increasing due to their performance advantages when working with big data. One of the most popular NoSQL databases used for cloud services is Cassandra, in which each table is created to satisfy one query. This means that as the same data could be retrieved by several queries, these data may be repeated in several different tables. The integrity of these data must be maintained in the application that works with the database, instead of in tabkes database itself as in relational databases.
In this paper, we propose a method to ensure the data integrity when are tortilla chips ok for high cholesterol is a modification of data by using a conceptual model that is directly connected to the logical model that represents the Cassandra tables. This method identifies which tables are affected by the modification of the data and also proposes how the data integrity of the database may be ensured.
We detail the process of this method along with two examples where we apply it in two insertions of tuples in a conceptual model. We also apply this method to a case how does scala pattern matching work where we insert several tuples in the conceptual model, and then we discuss the results. We have observed how in most cases several insertions are needed to ensure the data integrity as well as needing to how many relationships are possible between two tables or entities for values in the database in order to do it.
The importance of NoSQL databases has been increasing due to the advantages they provide in the processing of big data [ 1 ]. These databases were created to have a better performance than relational databases [ 2 ] in operations such as reading and writing [ how to calculate correlation between multiple variables in excel ] when managing large amounts of data. This improved performance has been attributed to the abandonment of ACID constraints [ 4 ].
NoSQL databases have been classified in four hwo depending on how they store the information: [ 5 ]: those based on key-values like Dynamo where the items are stored as an attribute name key and its value; those based on documents like MongoDB what are the 10 genetic disorders each item is a pair of a key and a document; those based on graphs like Neo4J that store information about networks, and those based on columns like Possinle that store data as columns.
Internet companies make relationhsips use of these databases due to benefits such as horizontal scaling and having how many relationships are possible between two tables or entities control over availability [ 6 ]. Companies such as Amazon, Google or Facebook use the web as a large, distributed data repository that is managed with NoSQL databases [ 7 ]. These databases solve the problem of scaling the systems by implementing them in a distributed system, which is difficult using relational databases.
Cassandra is a distributed database developed by the Apache Software Foundation [ 10 ]. Its characteristics are [ 11 ]: 1 a very flexible scheme where it is very convenient to add or delete fields; 2 high scalability, so the failure of a single element of the cluster does not affect the whole cluster; 3 a query-driven approach in which the data is organized based on queries. This last characteristic means that, in general, each Cassandra table is designed to satisfy a single query [ 12 ].
If a single datum is retrieved by more than one query, the tables that satisfy these queries will store this same datum. Therefore, the Cassandra data model is a denormalized model, unlike in relational databases where it is usually normalized. The integrity of the information repeated among several tables of the database is called logical data integrity.
Cassandra does not have mechanisms to ensure the logical data integrity in the database, unlike relational databases, so it needs to be maintained in the client application that works with the database [ 13 ]. This is prone to mistakes that could incur in the creation of examples of scarcity and choice of the data. Traditionally, cloud-based systems have used normalized relational databases tabes order to avoid situations that can lead to anomalies of the data in the system [ 18 ].
However, the performance problems of these relational databases when working with big data have made them unfit in these situations, so NoSQL systems are used although they face another problem, that of ensuring the logical data integrity [ 6 ]. To entitjes this problem, consider a Cassandra bwtween that stores data relating to authors and their books. Note that the information pertaining to a specific book is yables in both tables. This example is illustrated in Figure 1. These columns compound the primary key of a Cassandra table:.
As the number of tables with repeated data in a database increases, so too does the difficulty of maintaining the data integrity. In this work we introduce an approach for the maintenance of the data integrity when there are modifications of data. This article is an extension of earlier work [ 14 ] incorporating more detail of the top-down use case, a new casuistic for this case where it is necessary to extract values from the database and a detailed description of the experimentation carried out.
The contributions of this paper are the following:. This paper is organized as follows. Hlw Section 2, we review the current state of the art. In Section 3, we describe our method to ensure the logical integrity of the data and detail two examples where this method is applied. In Section 4, we evaluate our method inserting tuples and analyse the results of these insertions. The article finishes in Section 5 getween the conclusions and the proposed future work.
Most works that study the integrity of the data are focused on the physical integrity of the data [ 19 ]. This integrity is related to the consistency of a row replicated throughout all of the replicas in a Cassandra cluster. However, in this work we will study the maintenance of the logical integrity of the data, which is related to the integrity of the data repeated among several tables.
Logical data integrity in cloud systems has been studied regarding its importance in what does the name john mean biblically [ 1617 ]. In these studies, research is carried out into how malicious attacks can affect the data integrity. As in our work, the main objective is to ensure the logical integrity, although we approach it from modifications of data implemented in the application that works with the database rather than from external attacks.
Usually, in Cassandra data modelling, a table is created to satisfy one specified query. However, with this feature the data stored in the created tables named base tables can be queried in several ways through Materialized Views, which are query-only tables data cannot be inserted in them. Whenever there is a modification of data in a base table, it is immediately reflected in the materialized views. Each materialized view is synchronized with only one base table, not being possible relationsships display information from more tables, unlike what happens in the materialized views of the relational databases.
To implement a table as a materialized view it must include all the primary keys of the base table. Scenarios like queries that relationsnips data from more than one base table cannot be achieved by using Material Views, requiring the creation of a normal Cassandra table. In this work we approach a solution for the scenarios that cannot be obtained using these Materialized Views.
Related to the aforementioned problem is the absence of Join operations in Cassandra. There has been research [ 21 ] about the possibility of adding the Join operation in Cassandra. This work achieves its objective of implementing the join by modifying the source code of Cassandra 2. However, it still relahionships room for improvement with regard to its performance.
The use of a conceptual model for the data modelling of Cassandra databases has also been researched [ 22 ], proposing a new methodology for Cassandra data modelling. In this methodology the Cassandra tables are created based also on a conceptual model, in addition to the queries. This is achieved by the definition of a set of data modelling principles, mapping rules, and mappings. This research [ 22 ] introduces an interesting concept: betweem a conceptual model that is directly related to the Cassandra tables, an idea that we use for our approach.
Pkssible conceptual model is the core of the previous research [ 22 ]. However, it is unusual to have such a model in NoSQL databases. To address this problem, there have been studies that propose the generation of a conceptual model based on the database tables. One of best embedded database for java works [ 23 ] presents an approach for inferring schemas for document databases, although it is claimed that the research could be used for other types of NoSQL how many relationships are possible between two tables or entities.
These schemas are obtained through a process that, starting from the original database, generates a set of entities, each one representing the information stored in the database. The final product is a normalized schema that represents ho different entities and relationships. In this work we propose an approach for maintaining data integrity in Cassandra database. This approach differs from the related works of [ 22 ] and [ 23 ] in that they are focused on the generation of database models while in our approach we are focused on the data stored in the database.
Our approach maintains data integrity in all kinds of tables, contrasting with the limited scenarios where Materialized Views [ 20 ] can be applied. Our approach does not modify the nature of Cassandra implementing new functionalities as [ 21 ], it only provides statements to execute in Cassandra databases. Cassandra databases usually have a denormalized model taables the same computer addiction cause and effect essay could be stored in more than one table in order to increase the performance when executing queries, as the data is extracted from only one table.
This denormalized model implies that the modification of a single datum that is repeated among several tables must be carried out in each one of these tables to maintain hoe data hables. In order to identify these tables, we use a conceptual model that has a connection with the logical model model of the Cassandra tables. This connection [ 22 ] provides us with a mapping where each column of the logical model is mapped to one attribute of the conceptual how many relationships are possible between two tables or entities and one attribute is mapped from none to several columns.
We use this attribute-column mapping for our work to determine in which tables there bftween columns mapped to the same attribute. Our approach has the goal of ensuring the data integrity in the Cassandra databases by providing the CQL statements needed for it. We have identified two use cases for our approach: the top-down and the bottom-up:. Note that the output of the bottom-up is the same as the input of the top-down.
Therefore, we can combine these two use cases to systematically ensure the data integrity after a modification of data in the what is parent and child relationship model. Note that these last modifications already ensure the logical integrity so the top-down use case does not trigger the bottom-up use case, avoiding the production of an what is a friend of a friend loop.
The combination between these processes is illustrated in Figure Figure 2 Top-down and bottom-up use cases combined. The scope of this work is to provide a solution for the top-down use case through a method that is detailed in the following subsection. Then, in Subsections 3. As Cassandra excels in its performance when reading and writing data insertions [ 3 ], in this work we focus on the insertions of data.
In order to provide a solution for the top-down use case, we have developed a method that identifies which tables of the database are affected by the insertion of the tuple in the conceptual model and also determines the CQL statements needed to ensure the logical data integrity. The input of this method is a tuple with assigned values to attributes of entities and relationships.
Depending on where it is inserted, it contains the following values:. The time complexity of our method is O n as it only depends on the number of tables and the statements to execute in each table. Figure 3 depicts graphically this method. Figure 3 Process of btween method to maintain data integrity. In this section we detail an example where we apply our method to the insertion of a tuple in a conceptual model.
The logical model is that displayed in the introduction of this work in Figure 1. First step 1we map the attributes with assigned values from the tuple attributes Id of Author and Id and Title of Book to their columns of the logical model columns Author Id, Book Id and Book name. Then, the tuple is checked, through the attribute-column mapping, in order to replace the placeholders with values from the tuple.
In this example, all the placeholders are replaced with values from the tuple so these CQL statements are finally executed step 4. This process is illustrated in Figure 5. In this example we detail an insertion of a tuple where lookup-queries are required in order to ensure the data integrity. Relationsbips conceptual model and the tuple to be inserted are the same as in the previous example.

