{{ message }}
Document Oriented Query Partitioning Technique Implementation#319
Merged
mrigger merged 24 commits intoMar 30, 2021
Merged
Conversation
Adding support for mongodb database version 4.4. With this commit random collections are created with optional validation options which can enforce the schema on inserts. Additionally the execution is logged in mongo shell code.
Random Document insertion with the option to follow the schema validation or set random types is now supported. Also an option has been added to randomly set an insert to null. Validation has to be turned off for these options though. Additionally, random indexes are created over random columns, either ascending or descending and when there are multiple indexes added for a single collection it creates a composite index out of the other ones.
The project stage is similar to the SELECT in SQL where we can specify which columns are returned. The Lookup stage is similar to the LEFT OUTER JOIN of SQL, for this we have to specifically make new random columns that specify the join column.
The query ast has multiple key difference to the SQL version. The core is the MongoDBSelect class that holds projection and lookup lists. The filter is then a tree similar to the SQL version. In order to execute and log, which are different because one are API calls and the other MongoDB shell commands there exist two visitors, the ToQueryVisitor and the ToLogVisitor.
Computed functions allows to perform arithmetic operations in queries. It is part of the projection pipeline stage, where a new field is added that holds the result. This update adds support for random computed fields with functions such as add, multiply, pow, sqrt and more.
…on in MongoDB Similar to binary comparison regular expression is a new Leaf node that can be added in the tree. To generate the pattern the existing random string generator is used.
… regular expression Changes to the MongoDBComparatorHelper allow for expected errors to be ignored if they are thrown in a query execution. This feature is mainly necessary for all the illformed regular expressions that the randomized string produces.
Due to the randomly generated computed field tree that has random types as leaf nodes, there are exceptions that we ignore such as wrong type, value not positive for sqrt for example and similar issues.
The options are flags that can be set in a string such as "im". The expression generator now generates a random valid options for the regular expression operator in MongoDB.
… documents in MongoDB This new variation introduces a variant where we execute the pipeline query once with a count as a last pipeline stage and compare the output to the result set without the count stage.
Until now not has been simulated by nor(id exists, bool_expression) and it has led to believe that the underlying structure works fine. Now after the rework the not gets evaluated and every logical operation inverted by the new NegateVisitor. At the lowest stage type problems remain.
This change reveals some issues with how we form queries for mongodb. The core of it seems to be that neither greater equal nor smaller include null values and it is rather hard to define a query that finds the ones that are null. The query that projects includes all.
…serting data For four data types of ArangoDB integer, double, string and boolean, this commit supports creating collection, keeping track of the schema, creating tables, collections and inserting randomized data.
ArangoDB now randomly generates queries that support binary comparison, binary logic operations such as or/and and the unary prefix not. Also everything is logged and the results are checked with the new ComparatorHelper.
A new generator and query has been added to support the new functionality.
Similar to MongoDB, in ArangoDB variables for computed values are created and with a special keyword LET calculated.
This oracle generates a random query, executes it and if the result set is not empty, chooses a document at random and removes it from the collection. The query is executed again to check if the document is really removed and at the end a new document is generated and inserted to make sure that the dataset is not decreasing in size.
When using cosmos make sure to set the configuration string in CosmosProvider.
mrigger
reviewed
Mar 24, 2021
mrigger
left a comment
Contributor
There was a problem hiding this comment.
Thanks a lot for the PR and the clean code! I only have some minor nitpicks. Perhaps you can go over them and see which ones are worth addressing (also considering the risk of introducing an untested change).
Would you also be able to add a check for our CI (see https://github.com/sqlancer/sqlancer/blob/master/.github/workflows/main.yml)?
Sorry, something went wrong.
2 hidden conversations
Load more…
See Pull request sqlancer#319 on github.com/sqlancer for more information.
Contributor
|
LGTM. Thanks a lot! |
Sorry, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.
Includes extended support for MongoDB, Cosmos and ArangoDB.