GraphQL – Everything You Need to Know

GraphQL is an open-source query language and server-side runtime created by Facebook as an alternative to traditional RESTful API calls. One of its goals is to decrease the load on the server: before GraphQL, clients often had to fetch an entire resource just to read a few fields. With GraphQL, getting exactly the data you need from the server is much easier and a lot faster.

  • You can implement a GraphQL API using Apollo Server, which is completely free to use.
  • GraphQL works with many languages, but for this particular post we will be using NodeJS and NPM.
  • If you are not familiar with ReactJS, don’t worry; this tutorial does not assume any ReactJS knowledge.

Using GraphQL – Why is it Necessary?

RESTful APIs serve one purpose – fetching data from an API. But what if the data you need cannot be retrieved in a single request? Making several round trips puts excessive load on the server and increases response time. This is where GraphQL comes in. GraphQL offers a structured approach to requesting data through a simple yet powerful query syntax, and it can traverse, retrieve, and modify data in a single request.

Get What You Want

The best part about GraphQL is that you get exactly what you ask for. GraphQL queries return data in a fast and predictable way, regardless of the underlying data source.
Let’s take an example of how you can use the GraphQL API to get data from an API.
There is an object Movie with attributes Name, Genre, and Year. Suppose you want to fetch only the Name and Year of each movie. With a REST endpoint such as /api/v1/movies, you end up fetching the whole Movie object. GraphQL solves this problem by fetching only what matters.

Fields

A GraphQL query for the above data would look like this:

{
  movies {
    name
    year
  }
}

Returns

{
  "data": {
    "movies": [
      {
        "name": "avengers",
        "year": "2019"
      },
      {
        "name": "justice league",
        "year": "2017"
      }
    ]
  }
}

In the above query, each requested property is known as a ‘field’ in GraphQL.
Now, what if you want more details about the movie object? GraphQL lets you do that too.

{
  movies {
    name
    year
    genre
    studio {
      name
      location
    }
  }
}

This is a simple query that you will use to get data from the API. The result of this query will be like:

{
  "data": {
    "movies": [
      {
        "name": "avengers",
        "year": "2019",
        "genre": "action",
        "studio": {
          "name": "Marvel",
          "location": "USA"
        }
      }
    ]
  }
}

Requesting nested data like this is called ‘sub-selection’ in GraphQL. In our example, we have a sub-selection of name and location on the studio field.

Types

A GraphQL schema describes the types of data the API can return. For our movie example:

type Query {
  movies: [Movie]
}

type Movie {
  name: String
  year: Int
  studio: Studio
  budget: Float
}

By describing a type system, you reduce errors from fetching unnecessary or malformed information in the API call.

Arguments

GraphQL is a highly useful language when you know what type of data you want from each request. Let’s say you know the ID of a person but want their Name and DOB; you can pass the ID as an argument.

{
  person(id: "1") {
    name
    dob
  }
}

The result will show the Name and the DOB of the person whose ID you have passed in as the argument.

{
  "data": {
    "person": {
      "name": "Marco Polo",
      "dob": "1/11/19"
    }
  }
}

Arguments can be of many different types. In our example we passed a simple scalar argument, but GraphQL also lets you define custom argument types for more specific data sets.

Aliases

GraphQL also allows you to fetch data from different datasets in a single query. Because you cannot query the same field with different arguments twice, you give each result a name of its own. This technique is called ‘aliases’. In the example below we query the movie field twice under different aliases, and as you can see the results are different.

{
  avengers: movie(episode: AVENGERS) {
    name
  }
  endgame: movie(episode: ENDGAME) {
    name
  }
}

The result fetches different data for both.

{
  "data": {
    "avengers": {
      "name": "Iron Man"
    },
    "endgame": {
      "name": "Thor"
    }
  }
}

Fragments

Fragments are reusable sets of fields. Let’s say you want to compare two objects and their fields side by side. Repeating the same fields in both halves of the query quickly gets complex; with fragments you define the fields once and reuse them.

{
  leftComparison: movie(episode: AVENGERS) {
    ...comparisonFields
  }
  rightComparison: movie(episode: ENDGAME) {
    ...comparisonFields
  }
}

fragment comparisonFields on Movie {
  name
  budget
  studio {
    name
  }
}

With the comparisonFields fragment you can easily compare complex queries by breaking them into small chunks. This is especially useful when you have to display multiple objects in a single UI.

Operation name

For explanation purposes we have omitted the operation keywords so far. But when you run GraphQL queries in a production environment you should use the full structure: the query keyword, an operation name, and unambiguous code.
An example using the query keyword as the operation type is given below:

query MovieNameAndCast {
  movie {
    name
    cast {
      name
    }
  }
}

Variables

It is not feasible to pass dynamic arguments directly in the query string, because the client would then have to manipulate the query string at runtime. GraphQL lets you factor these dynamic values out of the query and pass them as a separate dictionary. These values are called variables.
Instead of a static value, the query references a variable named $variableName, and the value is passed separately as JSON.

query MovieDetails($movie: String) {
  movie(name: $movie) {
    name
    year
  }
}

Variables:

{
  "movie": "AVENGERS"
}

In the above example, we have added the variable separately in the client code. This way, we won’t have to construct a separate query for other similar data.
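On the client, this typically means sending the static query string and the variables dictionary together in one JSON payload. A minimal sketch of this idea follows; the MovieDetails query, field names, and /graphql endpoint are illustrative assumptions, not a real API:

```javascript
// Hypothetical client-side payload: the query string stays static while the
// dynamic value travels in a separate "variables" dictionary.
const payload = {
  query: `query MovieDetails($movie: String) {
    movie(name: $movie) {
      name
      year
    }
  }`,
  variables: { movie: "AVENGERS" }
};

// Serialise the payload as JSON; a client would POST this body to the
// GraphQL endpoint, e.g. fetch("/graphql", { method: "POST", body }).
const body = JSON.stringify(payload);
```

Changing the movie only means changing the variables object; the query string itself never needs to be rebuilt.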

Directives

Directives work together with the variables you pass within the query, but the purpose of a directive is to hide or show certain parts of the response.
For example, you can include or exclude certain fields by introducing a Boolean variable. The Boolean value can be changed later without rewriting the whole query.

query MovieDetails($movie: String, $withGenre: Boolean!) {
  movie(name: $movie) {
    name
    genre @include(if: $withGenre)
  }
}

Variables:

{
  "movie": "AVENGERS",
  "withGenre": true
}

Here, true includes the genre of the movie in the response. Passing false instead hides it, without changing the query itself.

Fulltext search with Apache Ignite

Apache Ignite is a memory-centric platform for storage, analytics and computation. Hypi builds on top of Ignite to provide a low-code serverless platform that enables rapid application development.

To do this, Hypi makes heavy use of Ignite’s key value APIs. Ignite itself is very flexible and provides several APIs as a means of interacting with it and the data within it. These include Ignite’s SQL99 API which gives you access to data via standard SQL and several others.

Ignite features machine learning APIs and as of Ignite 2.7, Tensorflow integration. All things Hypi will be making available over time.

In this post, we will focus on one specific feature in Hypi: fulltext search. We’ve put out an introduction to Hypi’s query language before in the HypiQL post (since renamed to ArcQL). It covers the syntax/grammar but not much else.

Here, we’ll discuss briefly how that syntax maps down to actually finding your data in the Hypi platform.

You can skip some of this and jump into the slides we presented at the Paris and London in memory computing meetups in Feb 2019.

Hypi GraphQL Fulltext

Read on for a breakdown.

First, it helps to become acquainted with how Hypi knows what data you want to put into the platform. At its core is GraphQL. The slides use a model that looks like this:

Hypi GraphQL TODO model

You’ll have noticed the use of @field in the type declarations. This is a Hypi directive that allows the developer of an app to customise some aspect of how Hypi deals with the field to which it is applied. In this image, two things are being done.

  1. indexed: true – This causes Hypi to index the field, making its contents searchable via ArcQL.
  2. type: Keyword – Where this is applied, the field is used for exact matches. This is good for things like IDs or emails that must match exactly. If this is not set, it defaults to Text, which enables partial matches (like a search engine).
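As a sketch, a type declaration using this directive might look something like the following; the Item type and its fields here are illustrative, not the exact model from the slides:

```graphql
type Item {
  # Exact matches only: good for IDs and emails
  id: String @field(indexed: true, type: Keyword)
  # Defaults to Text: partial, search-engine-style matching
  description: String @field(indexed: true)
}
```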

The slides demonstrate that this model would produce a GraphQL API that looks like this:

Default GraphQL CRUD API generated by Hypi

Hypi allows an ArcQL parameter to filter any data inserted with these models. ArcQL, being implemented on top of Apache Lucene, supports many different types of queries. At present there are seven types:

Types of queries supported by ArcQL
  1. Term queries – these are used for matching on Keyword fields i.e. exact matches
  2. Phrase queries – enable partial matching (on fields indexed with Text as their type)
  3. Prefix queries – enable matching the start of contents of a field
  4. Wildcard – enables matching any single character with ? or any number of characters with *
  5. Fuzzy queries – enable matching words that match if a few characters are changed e.g. name, tame or dame would all match a fuzzy query
  6. Range queries – allows finding values within a specific range, mostly numeric but can be used for strings as well
  7. Match all queries – used when you don’t know what to search for and just want to paginate all the data.

Hypi uses what’s called an affinity function in Ignite to determine the set of Ignite nodes on which it keeps the index for a cache. In general, the index and the raw data share nodes; in the future it will be possible to have dedicated nodes for storing indices only.

Using the Affinity function, Hypi consistently maps data around the cluster using Rendezvous hashing.

Hypi query and data routing based on Affinity function

Hypi implements a graph system on top of Ignite; combined with its search capabilities, this enables the automatic resolution of links in the relationships found in the GraphQL models. This means that when referencing an object there is no need to store a “foreign key” and perform another query to resolve the foreign object. Hypi allows you to naturally express the relationship using the GraphQL SDL type system, as demonstrated in the simple todo app model.

ArcQL fully supports this. In the query types above, field names are shown as simple one-word fields. In reality, ArcQL supports implicit JOIN queries that are the equivalent of doing a LEFT JOIN in a relational database.

For example, suppose you added comments to some items and wanted to find Item objects that have specific comments. A simple ArcQL query to do this may look like:

findItem(arcql: "comments.text ~ 'the text to search for'")

This simple but powerful query will find all items with comments matching the search text. What’s more, given that it’s a GraphQL query, if the GraphQL selection includes the comments field then Hypi will automatically resolve the comments belonging to the matching Item objects.
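Such a selection might look something like this (the field names here are illustrative):

```graphql
{
  findItem(arcql: "comments.text ~ 'the text to search for'") {
    title
    comments(arcql: "text ~ 'the text to search for'") {
      text
    }
  }
}
```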

Notice that a second arcql parameter can be supplied on the comments field as well. It is entirely optional, but it means you can perform a sub-query on the comments of the items being returned.

With this, Hypi gives you the powerful capability of doing JOINs and sub-queries in a simple, concise and clear way. It even marshals the results for you, a luxury afforded by GraphQL.

In a follow up post, we will discuss modelling with Hypi.
We thank GridGain, who organised the meetups, and Oracle, who also presented and demonstrated their new TimesTen + Kubernetes integration.

Good Enough: The Enemy of Innovation

The building of modern technology, application development, and all that goes along with it is anchored in the idealization of innovation. For years, the web and mobile application development has rooted itself in the idea of pushing things forward, building a better world for the people living in it.

Not too many years ago, the market for technology left the boardrooms of large enterprise companies and entered the consumer market. It’s interesting to consider that smartphones, portable devices, tablets and even laptop computers have only become a major part of life for the general public over the past two decades. At the beginning of the 21st century, home computing was booming in industrialized countries. Now, people have handheld mobile devices in the remotest regions of the world.

This could all be considered a metric for how innovation in technology has brought the world closer together. The idea that creators and developers have stood so long on the shoulders of giants to build the technical connections we have could be viewed as a strong case for pushing the needle forward.

But are we moving in a positive direction? What have been the major technological breakthroughs a person on the street can benefit from? How do the applications we use regularly innovate? Do they at all?

Unfortunately, this doesn’t seem to be the case.

Just a few years ago, there was a boom in start-ups and tech companies building solid, reliable software and hardware devices. The focus was on making things better and more convenient. Sadly, it seems the focus has changed to making a Minimum Viable Product, something “good enough” to put out in the market, without much thought as to the value it brings to the people who will use it.

The “good enough” stamp on applications and products often means they are rarely fully functional, often buggy, and don’t deliver a good user experience. There may be a big initial jump from early adopters, but once the dust settles and the checks clear, the flaws become obvious, and in some cases they put at risk the privacy or data of the individuals who were looking for an innovative experience.

This isn’t something recent either. Complacency is an issue that raises its ugly head from time to time. Take a look at our thoughts on the future of Computer Science in 1984 and it’s not difficult to see we work for the now, but innovation can be stifled if there is no reason to push forward.

With the mixture of startups and venture capital, the get-rich-quick appeal of IPOs, and pushing forward on fronts end users never asked for or wanted, innovation seems to have been brushed aside in the search for profit. We’ve been seeking the Minimum Viable Product when we should be searching for stability and a product that is usable.

Innovation ends when we can no longer build things people can use. The focus needs to return to the user, or community of users, and away from the money play. Only then can we return to innovation and making the world a better place.

JOINs at scale: Comparing Hypi’s ArcQL to SQL

Introduction

Hypi’s DAO uses the Arc io.hypi.arc.codegen.descriptor model to represent GraphQL models/relationships across Ignite caches. Every non-scalar GraphQL type creates one Ignite cache and one Lucene index, and each record of a type creates one Lucene org.apache.lucene.document.Document.

Relationships (one to one and one to many)

Relationships are created by writing a Lucene org.apache.lucene.document.Document for each link. That is, if type A has a field b of type B, then a org.apache.lucene.document.Document is created in B’s index with two keys:
ForeignRef.ARC_FOREIGN_KEY_REF, which stores the ID of the A, and
ForeignRef.ARC_PK_FIELD_NAME, which stores the ID of the B being referenced.
The end result is that two writes are done to B’s index: the first is the ForeignRef and the second is the record for B itself. Since both records are written to B’s index, there is no need for a transfer cache to write references to and pull from in the io.hypi.arc.lucene.LuceneIndexingSpi. The ForeignRef is written with an arbitrary but unique ID as its key so as not to collide with any keys in B’s cache.

Query Generation

As a result of the ForeignRef documents in B’s index, when search(ReqCtx) is run it wraps the original query in a org.apache.lucene.search.BooleanQuery: a org.apache.lucene.search.BooleanClause.Occur.MUST clause on the original query, and a org.apache.lucene.search.BooleanClause.Occur.MUST_NOT clause on a generated query matching the ForeignRef.ARC_FOREIGN_KEY_REF field, which excludes the reference documents in B’s index from the results.

If the lucene index becomes corrupt or otherwise wrong, an index(ReqCtx, LuceneIndexingSpiWrapper) query must be done on A to rebuild the indices. This can be extremely taxing on resources and should be done with care and rarely.

During resolution of the foreign fields, the query is again a org.apache.lucene.search.BooleanQuery: one org.apache.lucene.search.BooleanClause.Occur.MUST clause matches the user’s sub-query and another org.apache.lucene.search.BooleanClause.Occur.MUST clause matches the ForeignRef.ARC_FOREIGN_KEY_REF field.

Foreign key joins

This implementation makes use of Lucene’s org.apache.lucene.search.join.JoinUtil. First, a query is executed on B’s index to find all foreign references for the current doc. Lucene’s JoinUtil gathers all matching refs into a query.

Second, the query from the join is used in a org.apache.lucene.search.BooleanQuery with a org.apache.lucene.search.BooleanClause.Occur.MUST on B’s index and another org.apache.lucene.search.BooleanClause.Occur.MUST with the user’s sub-query, paging, limits etc. If no user sub-query is given then the second query is a match-all.

Note that this is the same whether the relationship is one to one or one to many. All foreign references work the same.

Joins and sub-queries

Foreign key relationships can be queried with dot syntax e.g a.b.c where a is the source, b is a foreign object being referenced and c is a field on the foreign object. This creates implicit join queries.

The default io.hypi.arc.lucene.ArcQLParser sub-query on foreign keys is ForeignRef.ARC_DEFAULT_FOREIGN_FILTER, which returns the 10 newest entries of B.

One other important factor in the EntityGraph is how sub-queries are done. Two options are either to perform an analysis after io.hypi.arc.lucene.ArcQLParser has generated a query, or to generate multiple queries, one for each field, as the ArcQL tree is traversed.

In the second model, the io.hypi.arc.codegen.descriptor.SchemaDescriptor is used to understand what is or isn’t a foreign key.
So what does it mean to join? Let’s revise what joins mean in SQL and the types of JOINs that are typically available.

CROSS JOIN

A CROSS JOIN is a cartesian product, i.e. a product as in “multiplication”; the mathematical notation uses the multiplication sign to express this operation: A × B.
A CROSS JOIN combines every row from one table with every row from the other table. If either table is empty there are no results, since anything times 0 is 0 in a cartesian product.

INNER JOIN

(Called an equi join when = is used and a theta join when other comparators are used.)
An INNER JOIN builds on a CROSS JOIN and filters the rows by some predicate. Since it is effectively a filtered CROSS JOIN, if either table is empty there are no results.
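For example (table and column names here are illustrative):

```sql
-- CROSS JOIN: every row of a paired with every row of b
SELECT * FROM a CROSS JOIN b;

-- INNER JOIN: the cartesian product filtered by a predicate (an equi join)
SELECT * FROM a JOIN b ON a.x = b.x;
```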

NATURAL JOIN and USING keyword

A NATURAL JOIN takes the columns from both tables with the same name and compares them with the = comparator, i.e. it is an equi join that saves you having to write ON a.x = b.x.

OUTER JOIN

LEFT OUTER JOIN

Similar to an INNER JOIN except that, if no match is found in the right table, it still returns the left row with null for the missing right-hand columns instead of no row at all. A LEFT OUTER JOIN can be thought of as an INNER JOIN combined with a UNION of the unmatched left rows.

RIGHT OUTER JOIN

Exact opposite, retains rows from the right table even if there are no matching rows in the left table.

FULL OUTER JOIN

Retains rows from both tables with null on either side where there is no matching row on the other.

SEMI JOIN

(There is no keyword for this in ANSI SQL.)
A semi join is achieved using EXISTS with a sub-query; the same query can be rewritten with the IN keyword to get the same results:

SELECT * FROM actor a
WHERE EXISTS (
  SELECT * FROM film_actor fa WHERE a.actor_id = fa.actor_id
)

This finds actors that have been in at least one film, i.e. we want the actor rows but not the film rows; there just happens to be a relationship between them.

Others

ANTI JOIN – the opposite of a SEMI JOIN, achieved using NOT IN. But be careful: NOT IN is not equivalent to NOT EXISTS when null values are involved. Other join constructs include LATERAL JOIN and MULTISET.

EntityGraph LEFT OUTER JOIN

Hypi implements an equivalent to SQL LEFT OUTER JOINs, i.e. if no records are found in the referenced type, results are still returned from the left side, and the field is returned as an empty array (or null for one-to-one references).
By Example
Take this as the model for the examples.

GraphQL Query

ArcQL

When selecting primitives only, no join is done.

Planned examples:

  • In the first example, don’t select image; then repeat the example selecting image and show that it adds a LEFT JOIN on Image.
  • Select tweets.media.url and show what that join looks like.
  • Next, do AND and OR with 1-to-N single fields, single field to group, and group to group.
  • See if we can arrive at a syntax that leads to one or more sub-queries on a field, with a non-ambiguous way of doing ORs and ANDs.
  • Potentially a way to force a field assertion to be treated as an INNER JOIN instead of a LEFT OUTER JOIN, so that in a.b = 1 only rows with a reference to a b = 1 are returned from the left table.

Example 1:
findUser
username = 'courtney'
SELECT * FROM User WHERE username = 'courtney' ORDER BY A.hypi.created LIMIT n

Example 2:
findUser
tweets.content ~ 'low code'
SELECT * FROM User LEFT OUTER JOIN B ON A.hypi.id = B.hypi.id WHERE B.i = 10 ORDER BY B.hypi.created LIMIT n

Introduction to Clojure

Clojure is a modern dialect of Lisp designed to run on the Java Virtual Machine and the Common Language Runtime. Descending from Lisp, Clojure has a strong emphasis on functional programming and a philosophy of treating code as data. Learning a Lisp will challenge you to think about programs and programming in a new way compared to procedure-oriented languages. Clojure is a very accessible dialect of Lisp that runs on most computers.

This gentle introduction assumes that you are coming from a procedural language like C or Java with little or no exposure to Lisp.

The REPL

If you’re approaching Clojure from a compiled language like C or Java, the Clojure REPL might be an unfamiliar way of working with a language. REPL stands for Read-Evaluate-Print Loop; it is an interactive programming environment that allows you to develop Clojure code one expression at a time. We’ll start demonstrating Clojure through the Clojure REPL tool clj.

To run our sample code, you will need to install Clojure locally, or you can run it from an online interpreter. Instructions for installing Clojure locally can be found on the Getting Started page at Clojure.org. If you would prefer to start with an online interpreter, you can try repl.it or any of the other sites that come up in a search.

Assuming that you are running your code locally, launch clj from the command line. You should see a prompt similar to:
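A freshly started session looks something like this (the exact version string depends on your installation):

```
Clojure 1.9.0
user=>
```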

The prompt tells us we are running version 1.9.0 of Clojure and that you can now start typing expressions for the interpreter to evaluate. (You can press Control + D at any time to quit.) We’ll start by typing in some basic mathematical expressions in Clojure.

Clojure Expressions

Our basic math expressions consist of a prefix operator and a variable number of arguments. At first glance, this might look like a simple syntactic difference between Clojure and C-like languages, but the differences run a bit deeper. In C-like languages, programs consist of statements and expressions: statements have side effects but no value, and only expressions have a value. Clojure programs are built entirely out of expressions. A Clojure expression is a list (marked by parentheses) consisting of a function and a series of zero or more arguments.
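For example, evaluating a few prefix-notation expressions in the REPL:

```clojure
user=> (+ 1 2)
3
user=> (* 3 4 5)
60
```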

More complex expressions can be built up by combining expressions into larger expressions:
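For instance:

```clojure
user=> (+ (* 2 3) (- 10 4))
12
```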

In Clojure, the primary data structure is the list. We’ve already seen lists used to construct expressions; we can also store data in lists and manipulate that data using functions:
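For example:

```clojure
user=> (first '(1 2 3))
1
user=> (rest '(1 2 3))
(2 3)
```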

In the two expressions above, we treat our list as a data object by using quote (‘). Quote returns the unevaluated form of the list, so the interpreter doesn’t try to evaluate 1 as a function. We then use the function first to get the first element of the list and rest to get the remainder (as a list).

Function Definition and Special Forms
Clojure allows you to create a function using the defn special form. You’ve already seen one example of a special form (quote), defn is another. A special form is an expression that has its own evaluation rule. For now, you can think of special forms as the built-in syntax of the language that you can use for developing programs. Following ancient programmer tradition, let’s create a hello world function in Clojure:
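A hello world definition looks like this:

```clojure
user=> (defn hello-world [] (println "Hello World!"))
#'user/hello-world
```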

Our function definition is itself an expression that evaluates to a value. It starts with the name (hello-world), followed by a vector of arguments (in this case none) and an expression to perform our computation. In this case, which is slightly unusual, our expression has a side effect: it prints the message “Hello World!”.

Now, let’s create a function add-one that increments a value:
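For example:

```clojure
user=> (defn add-one [x] (+ x 1))
#'user/add-one
```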

Our definition consists of a name (add-one), a single parameter (x) and an expression representing our computation. Notice that even defn evaluates to a value, namely the var for the function. We can call our function once it’s defined in another expression:
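Calling it:

```clojure
user=> (add-one 5)
6
```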

Now let’s go back and try running our hello-world function:
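```clojure
user=> (hello-world)
Hello World!
nil
```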

Notice we get a slightly different result: nil. In Clojure, every expression evaluates to a value, so if a function returns no value, that is represented by nil. Nil is the absence of a value.

Clojure has additional special forms that allow you to build up more complex expressions.

The if form allows us to build expressions that return different values based on a test condition. Clojure has a built-in function zero? that tests if a value is zero. We can write our own version of that function:
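One way to write it (named my-zero? here to avoid shadowing the built-in):

```clojure
user=> (defn my-zero? [x] (if (= x 0) true false))
#'user/my-zero?
user=> (my-zero? 0)
true
user=> (my-zero? 7)
false
```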

It is very important to remember that if evaluates to the first expression if the test is true and the second expression if the test is false. Those expressions could be far more complex than just true or false.

The let special form allows us to create a local binding (similar to a local variable) of the result of an expression to a symbol:
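For example:

```clojure
user=> (let [x 5
             y (* x 2)]
         (+ x y))
15
```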

With let, we define a sequence of bindings between symbols and the results of expressions. Notice that here we use let in the context of a simple expression; you don’t necessarily have to use it within a function.

Finally, recursive expressions are one of the basic building blocks in Clojure for building repeated computations. We can build a simple recursive factorial function:
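For example:

```clojure
user=> (defn factorial [n]
         (if (<= n 1)
           1
           (* n (factorial (dec n)))))
#'user/factorial
user=> (factorial 5)
120
```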

Higher Order Functions

Functions in Clojure can be passed as parameters to other functions and used to build complex expressions as well. The built-in functions map, reduce and apply are the main functions used with functional arguments, and each of these higher-order functions operates differently.
The map function applies its first argument (a function) to each element of its second argument (a list). We can use map in conjunction with our add-one function to build a new list with each element incremented by one:
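Assuming the add-one function from earlier:

```clojure
user=> (map add-one '(1 2 3))
(2 3 4)
```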

The reduce function takes a two-argument function, applies it to the first two elements of the list, then successively applies it to the next element and the previous result until a single value is returned:
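For example:

```clojure
user=> (reduce + '(1 2 3 4))
10
```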

Finally, the apply function applies the named function to the parameter list:
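For example:

```clojure
user=> (apply + '(1 2 3))
6
```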

This article touches on the basics of using Clojure – building up expressions from simple expressions, special forms, and higher order functions. Clojure is a rich language that includes a variety of built-in data structures in addition to lists, a package system, and the ability to load and access Java code in the JVM. For more information on working with Clojure, see the Getting Started section on Clojure.org.

IaM: Understanding what can be protected

In a previous post we introduced IaM in Hypi. That post described a few distinct categories of things that can be protected with the Hypi platform.

  1. GraphQL functions on a per app basis, this protection is Type based
  2. Records created by these functions, this protection is Type:ID based
  3. Non-record assets e.g. images/files, this protection is HTTP URI based

In this post, we will explore the first of these and explain how to go about making use of this. When creating an App, you can create one or more models that Hypi will generate a GraphQL API for. For example:

In doing so Hypi will generate several GraphQL functions for you to use to work with Post types.

On its own, these operations are unprotected i.e. anyone with access to the generated API can call them. In most cases, that is not ideal.

Thankfully, Hypi makes protecting them very easy. In your app definition, simply add a dependency on the hypi:iam:latest App. If you’re unfamiliar with the syntax: hypi is the publisher realm, iam is the name of the App, and latest is the release name (or version) that you’re adding a dependency on.

How does this solve the authorisation problem?
Hypi has support for scripting which takes on two forms

  • Hypi Prime – The Hypi Prime version enables the submissions of JavaScript or Java code that is executed locally to the data
  • Hypi Gamma – Full blown serverless functions that you can package into a Docker container, using any language or framework you want and Hypi will call out to it, passing the appropriate data

Authorisation is implemented with Hypi Prime. Being core functionality, the platform has built-in support specifically for this.

When the IaM app is added as a dependency of your app, it brings into scope a Hypi Prime function called iam-auth.fn which is a Java_Function defined by the IaM app as a pre-script. Functions in Hypi can be executed as pre or post scripts i.e. before or after any generated function, or in the case of custom GraphQL functions, any script can be called to be executed as the logic behind that GraphQL function.

So that explains a little of how authorisation can be added very simply. Once it’s added, what does it do, what is it authorising?

Before we answer that: Hypi’s authorisation is done with OAuth 2. Behind the scenes, when an operation is performed, various OAuth 2 actions are performed implicitly, and explicitly in the cases where they need to be.

A thorough treatment of OAuth 2 is outside the scope of this guide, an introduction to OAuth can be found here, the relevant concepts are

  1. Resource Server – Hosts the protected resources (The Hypi API is the resource server)
  2. Authorisation Server – Verifies the identity and checks if they can do what they’re trying to. To you this is also the Hypi API but internally it is delegated to a dedicated authorisation server.
  3. Resource Owner – Owns the protected resources (e.g. a Post record)
  4. Client – An agent trying to access the protected resource e.g. the browser or a mobile app
  5. Scope – A “bounded context” which limits what can be accessed

Requests to the Hypi-generated API use the standard Authorization header to pass an access token. The script takes this token and asks the authorisation server if the token is allowed to execute the GraphQL function. If the server says yes, the script returns true and the GraphQL function is executed as if nothing had happened; if the server says no, the script raises an exception and returns immediately, and nothing else is executed.
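Such a request might look like this (the host, path, function name and token here are purely illustrative):

```
POST /graphql HTTP/1.1
Host: api.example.com
Authorization: Bearer <access-token>
Content-Type: application/json

{"query": "{ findPost { id } }"}
```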

How does the authorisation server know if the user the token belongs to is authorised or not?
In the introduction to IaM post, Scope based protection is mentioned.

What Hypi does is treat every GraphQL function (including fields outside of the Query and Mutation objects) as a Scope. From the example above, if your Hypi realm is called publisher and the app is called blog, a scope is created for each generated function.

Hypi creates a resource server for every app. Any and all resources created are owned by this resource server. Hypi also creates default clients; one in particular is called web.

To the point, how does Hypi know if access should be granted or not? Recall from the introduction to IaM post that Hypi allows you to define one or more Policy and Permission. A permission binds a policy to either a scope or to a resource.

Hypi creates an “owner” based policy which checks if the user querying the resource is the owner, in which case, access is granted.

A policy does not work on its own, however, so there is an associated scope based permission which includes all of the scopes listed above.

With all of that, we’ve finally covered all of the pieces of how Hypi knows whether access should be granted or denied by simply adding the IaM app as a dependency. All it does is check that the user is the owner.

This post covers the simple case. In a follow-up post we will explain how this check can be extended to use other policies. For example, if a resource has been shared with another user, that user will not be the owner, so access would be denied based on what has been described here. However, a composite policy can be created which includes the owner based policy described here and another policy which checks whether permission has been granted to the user.

Intro: Hypi's High Performance Distributed In-Memory Computing with Apache Ignite

This is part 1 in a series on how Hypi is implemented.

One of the core features of the Hypi platform is that it allows a publisher to instantly go from a GraphQL model to a scalable, production ready GraphQL API.

Imagine you wanted to build an app like Facebook, an over simplified data model to start with may look like this.
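An over-simplified model of this kind might be sketched as follows (illustrative only; the type and field names are assumptions, not the actual model from the talk):

```graphql
# Illustrative sketch – a minimal social model; all names are assumptions.
type User {
  username: String!
  posts: [Post!]!
}

type Post {
  content: String!
  likedBy: [User!]
}
```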

Hypi will generate a CRUD API from this model; some of the GraphQL functions that become available would include the following.
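As a hypothetical illustration (function and type names assumed, following the CRUD pattern Hypi generates for the inventory example later in this document):

```graphql
# Hypothetical examples of generated CRUD functions for a Post type.
type Query {
  getPost(id: ID!): Post
  findPost(filter: String, first: Int, after: String): PostConnection
}

type Mutation {
  createPost(value: PostInput): Post
  updatePost(value: PostInputOpt): Post
  deletePost(ids: [ID]): Boolean
}
```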

The filter parameter in the API refers to a HypiQL filter

Hypi focuses on the relationships in the schema and depends heavily on them to understand what it should generate for your app's API.

At the November 2018 London In-Memory Computing meetup, we spoke about how Hypi implements these under the hood.

In particular, we introduced two original optimisation techniques created as part of our CTO's PhD research (Wormhole traversals and Vertex cascading), and discussed how these two techniques work in combination with a custom FM-Index (emphasising the importance of the Burrows-Wheeler Transform).

The talk covered the journey from the business case to the theory to the Apache Ignite implementation.

In this series, we will be writing a post which covers each of these in turn.

This post doesn't go into detail, but we hope it at least prepares you for the rest of the posts in this series, which will be detailed and will explain how these techniques and technologies work together.

We’re Heading to WebSummit!

After a stunning start to autumn, we’re heading to the largest tech conference in Europe! As part of the Beta program, we are planning on showing the latest Hypi has to offer.

This year at WebSummit, our goal is similar to what it was at Collision – to get tech and developer communities familiar with Hypi. We plan to focus on making our platform ready for everyone! Features you can expect to see include:

  • Generated GraphQL CRUD APIs for Developers
  • The ability to build front-end applications using any language or framework (Angular, React, etc)
  • Simple scripting API (JavaScript/Java) to add custom GraphQL functions without going fully serverless when you don't need to
  • Powerful, role-based access control for the ability to model anything from small apps to enterprise level apps

Come see how we can accelerate development time from 6 months to 6 weeks via our platform. Our “pay as you go” option allows you to take advantage of serverless deployments without integrations with our GraphQL API platform. With an integrated payment platform we also make it easy to accept payments within your application with common payment processing providers like Stripe and PayPal.

Hypi is built to grow with you, with tunable replication for scale and resilience right out of the box. With applications already built in like Hypi Cabinet (cloud storage), document and spreadsheet cloud applications, presentation anywhere capabilities, and more office related cloud software coming in late 2018 and early 2019 – Hypi is the answer to all your organisation’s needs.

We’re looking forward to seeing everyone in Lisbon! Stop by and see us in our booth.

Identity and Access Management AKA Authentication & Authorisation

Overview

Authentication & Authorisation are both done through GraphQL as are all other API requests.

There are several pieces to authentication and authorisation. Some are implicit, some explicit.

To be able to do anything, a user must first be logged in. After logging in, the user receives an access token. From this point onwards, let's assume "subject" refers either to the user or to an app acting on behalf of a user.

In order for an app to get data for a user the user must install the application. During the install, if the app being installed uses any other application’s data the user must confirm installation of those applications and their dependencies, if they’re not already installed. If the apps depended on are already installed the user is told that the new app will be able to access data from the apps it depends on.

We may need a mechanism to allow users to deselect some fields that an app cannot access from other apps.

The “authorisation” that the user gives at this stage is largely symbolic.

When an application is installed it is installed to an organisation. We do this so that we can put all data for an organisation into its own namespace. This means an app does not have to design its schema to account for multiple tenants. Each installation can be thought of as a separate instance (although it will be possible to share data across instances in the future).

The user installing the application must have the permission to do so for that organisation.

Authentication

When a user signs in, they pass their username and password via GraphQL. The user token is returned to the client/browser and set as a cookie. This access token is a pseudo "requesting party token" (RPT): it is not a Keycloak RPT, it is an actual Keycloak access token; however, we use it as if it were an RPT. This token cannot be used to perform any operations other than to get an app token; any other query with it will fail.

When an app is loaded, it must pass the access token to the API to obtain an app token. The app token can then be used in subsequent requests to perform data operations. No action can be taken with a user access token other than obtaining the app token.

The Hypi shell is optional; if it is used, it handles the conversion of the access token to an app token. By default the shell is used, i.e. accessing realm.hypi.app/publisher.app-name loads the shell, which sends a request to app-name.publisher.app.hypi.app?access_token=<app-token>.

Back to the app shell: When it loads an app, it does most of this implicitly. Since it owns the user token, it obtains an app token for each app it loads. Once it obtains the app token, it passes it to the iframe using postMessage. If an app token already exists for the currently loaded app, it fetches it from the cookie.

App tokens expire when the user token expires. Long lived app tokens are not supported for now.

Authorisation

Authorisation is granular to the field level. Every GraphQL query is executed via a function (field). These functions each perform different tasks, e.g. create, read, update, delete, share etc. As a result, we protect the functions and enable the definition of policies around them. The things that can be protected are:

  1. GraphQL functions on a per app basis, this protection is Type based
  2. Records created by these functions, this protection is Type:ID based
  3. Non-record assets e.g. images/files, this protection is HTTP URI based

Furthermore, restrictions can be applied on a per Client basis. Every image deployed with a release has a client generated for it. The credentials for the client are mounted into the container at /hypi/auth.json. This file contains an accessToken field specifically for this image, which is given all permissions allowed for that image.

Concepts

Before we go any further, let’s take a step back and clarify the various concepts involved.

  1. Subject – represents the user or system performing an action on behalf of a user.
  2. Resource – The thing being protected e.g. a GraphQL type, records for a type or assets of an app
  3. Policy – contains the knowledge/logic of whether access should be granted or denied on a Resource.
  4. Permission – associates a policy with a resource
  5. Realm – a mechanism of isolation/namespace
  6. Realm Clients – Clients are agents that act on behalf of a User. A Realm Client is specific to a given Realm, i.e. it can only act on Resources within that realm. There are a number of Clients in Hypi. At this point they are created implicitly and can be referenced by name where they are used, e.g. in defining client based policies. The types of clients that are created are grouped under:
      • App client – when a publisher creates an app, each image in the app implicitly creates a client. If that image communicates with Hypi on behalf of a user, it must use the client credentials that Hypi mounts as a file at /hypi/auth.json. This client can, for example, have broader permissions than user clients; since it runs inside the Hypi network it is more secure than user clients that run on user devices.
      • User client – currently one user client is created, called web. In the future, Hypi plans to explicitly add mobile, desktop and sensor clients and allow policies to be defined to control their access.
  7. Groups – a mechanism for providing a collection of Subjects, Roles, Policies, Scopes and Permissions.
  8. Roles – a domain/organisation specific means of labelling a set of Subjects.
  9. User – a person (typically) who can have Realm Clients act on their behalf.
  10. Scope – typically an action that can be performed on a Resource.

A Subject owns Resources of which there are a few types:

  1. GraphQL functions – functions called via GraphQL
  2. Records – Data created via GraphQL functions
  3. Assets – Any non-record data e.g. images or other files that belong to an app.

Authorisation in Hypi

A number of authorisation actions are implicit in Hypi. There are primarily three ways permissions are granted:

  1. Hypi automatically generates some implicit permissions which provide default access. By default, Hypi is very permissive, granting all users in a realm access to everything in that realm. It is expected that an administrator will configure this behaviour to provide a more restrictive setup if needed.
  2. An app developer defines default and thus implicit permissions for their app as well as suggested roles. When an app is installed these roles become available for app users to use.
  3. User/Subject defined roles and their associated policies.

An example of what is involved may be described like this:

Imagine you are creating a ground-breaking new app called Twitter. It allows its users to publish short messages that provide updates/news to the rest of the organisation. The new app allows you to send updates to all users in the organisation, to specific groups, or to teams within a branch or subsidiary (your app is for a very large organisation). Finally, you can also create these short messages and publish them privately so only you can access them (sad times), or share them with one or more specific users, i.e. direct and "group" messages.

Let’s break these down into concepts involved as far as authorisation is concerned.

  1. Organisation and Subsidiary – this represents a company or a subsidiary of a company. In Hypi, this is just a Group. Hypi Groups are optionally hierarchical; when a parent is defined, a Group inherits the settings of its parent, i.e. the parent's policies are applied to the children.
  2. Branch, Department, Teams and User Group – these are again Hypi Groups. User Group is distinguished here because it represents what the user would consider a group. For example, the organisation could have social groups, e.g. a Football Club; these all map to a Hypi Group object. The difference is that the UI presents these in a user friendly way using terms that make sense for the user; they're all simply Groups, however. By default, on registration, Hypi creates a single Group whose name is the same as the organisation's name they registered with.
  3. Policy – a policy can be based on one of the following:
      • Role – a role based policy determines if access is granted or denied depending on the roles a user has. Properties of this are:
        ◦ name: String!
        ◦ roles: [AuthRoleSpec!]! – list of roles to which the policy applies and whether each role is required for the policy to apply. AuthRoleSpec is a role name plus a required flag.
        ◦ clients: [String!] – list of clients to which the policy applies. This allows restricting what kind of data can be viewed depending on how the user is accessing the app, e.g. mobile vs web clients can have different permissions. Optional; if not defined the policy applies to all clients.
        ◦ clientRoles: [String!] – list of client specific roles this applies to. Optional; if not provided it applies to all client roles, if any exist.
        ◦ logic: AuthLogic – Positive or Negative. If positive, permission is granted when the user has all the required roles specified by the policy; if negative, permission is denied. Typically, role based policies are a good way to define a "catch all" policy for a Group.
      • Client – a client based policy determines if the client being used to access data is allowed. Where, for example, the client is web, the app can decide that web clients cannot see certain data, thus preventing potential leaks of otherwise secure information. In the same way, an app client can be allowed to see this data because its communication with Hypi is encrypted and not viewable by users. Properties of this are:
        ◦ name: String!
        ◦ clients: [String!]! – a list of client IDs, e.g. web, to which the policy applies.
        ◦ logic: AuthLogic – Positive or Negative. If positive, permission is granted when the client is any of those provided in the policy; if negative, permission is denied.
      • Time – time based policies take effect during the periods for which they're specified, e.g. to define holiday policies. Properties of this are:
        ◦ name: String!
        ◦ notBefore: DateTime – (yyyy-MM-dd hh:mm:ss) can be used, for example, to ensure a file is not viewable before the given date.
        ◦ notOnOrAfter: DateTime – can be used to ensure a file is not viewable after a given date.
        ◦ dayOfMonth: AuthInterval – applies during the given interval; if only start is given it applies only on that day, if both are given it applies between those days.
        ◦ month: AuthInterval
        ◦ year: AuthInterval
        ◦ hour: AuthInterval
        ◦ minute: AuthInterval
        ◦ logic: AuthLogic – Positive or Negative.
      • Aggregated – this type of policy provides a means of combining policies, i.e. a policy of policies. Properties of this are:
        ◦ name: String!
        ◦ policies: [String!]! – list of policies which this policy aggregates.
        ◦ decisionStrategy: AuthDecisionStrategy – defines how the policy arrives at a decision. The options are:
          ▪ Unanimous – all policies listed must be positive for this policy to result in a positive decision.
          ▪ Affirmative – at least one policy listed must be positive for this policy to result in a positive decision.
          ▪ Consensus – the number of positive policies must be greater than the number of negative ones, e.g. if 5 policies are included, at least 3 must be positive for this policy to be positive.
        ◦ logic: AuthLogic – Positive or Negative.
      • Group – a group based policy allows access to be granted or denied based on the group. Properties of this are:
        ◦ name: String!
        ◦ groups: [AuthGroupOptions]! – a set of groups to which the policy applies. AuthGroupOption is simply the group name and a boolean indicating if the policy extends to children of the group.
        ◦ logic: AuthLogic – Positive or Negative.
      • User – this type of policy applies on a per user basis; only users listed in the policy are allowed or denied. Properties of this are:
        ◦ name: String!
        ◦ users: [String!]! – list of usernames to which this policy applies.
        ◦ logic: AuthLogic – Positive or Negative.
  4. Permission – a Permission associates a Policy with a Resource or a Scope. For example, sharing a tweet in our application can have permissions applied to the tweet record, i.e. the resource, or the permission could be applied to the GraphQL function that determines whether the user can create a record in the first place; here the function would be the scope. Permissions come in two forms:
      • Resource Based – applies to specific resources or to all resources of a specific type. Properties of this are:
        ◦ name: String!
        ◦ resources: [ID!] – a set of IDs for resources to which this permission applies. These can be added and removed as necessary.
        ◦ type: [String!] – a set of types of resources to which this permission applies.
        ◦ policies: [String!]! – list of policies which this permission aggregates.
        ◦ decisionStrategy: AuthDecisionStrategy – defines how the permission arrives at a decision. The options are:
          ▪ Unanimous – all policies listed must be positive for this permission to result in a positive decision.
          ▪ Affirmative – at least one policy listed must be positive for this permission to result in a positive decision.
          ▪ Consensus – the number of positive policies must be greater than the number of negative ones, e.g. if 5 policies are included, at least 3 must be positive.
        Internally a permission is duplicated for each resource in the resources or type arrays (each maps to a permission in Keycloak).
      • Scope Based – scopes are broader than resources and tend to apply to things in an administrative way, but can be used for resources too. A scope simply has a name, e.g. publisher:App Name:Type:functionName or publisher:AppName:imageName:endPoint. A Scope is automatically created for the following things:
        ◦ GraphQL functions – every function is treated as a scope, enabling the creation of permissions on specific functions so use of a function can be granted or denied.
        ◦ Endpoints – when an app is created, each image defines endpoints which are reachable; each endpoint becomes a scope that can be protected with scope based permissions.
        Properties of scope based permissions are:
        ◦ name: String!
        ◦ resources: [ID!] – a set of IDs for resources to which this permission applies. These can be added and removed as necessary.
        ◦ scopes: [String!] – a set of scopes to which this applies.
        ◦ policies: [String!]! – list of policies which this permission associates with the resources.
        ◦ decisionStrategy: AuthDecisionStrategy – defines how the permission arrives at a decision. The options are:
          ▪ Unanimous – all policies listed must be positive for this permission to result in a positive decision.
          ▪ Affirmative – at least one policy listed must be positive for this permission to result in a positive decision.
          ▪ Consensus – the number of positive policies must be greater than the number of negative ones.
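The three decision strategies that appear throughout the permission types above can be sketched as a plain function over boolean policy outcomes (a hedged illustration of the decision logic, not Hypi's actual implementation):

```javascript
// Sketch of the three AuthDecisionStrategy options over a list of policy outcomes.
// Each element of `votes` is true (policy positive) or false (policy negative).
function decide(strategy, votes) {
  const positive = votes.filter(Boolean).length;
  const negative = votes.length - positive;
  switch (strategy) {
    case 'Unanimous':   return negative === 0;      // all policies must be positive
    case 'Affirmative': return positive >= 1;       // at least one positive
    case 'Consensus':   return positive > negative; // more positive than negative
    default: throw new Error('Unknown strategy: ' + strategy);
  }
}

// e.g. 5 policies with 3 positive: Consensus grants, Unanimous denies.
console.log(decide('Consensus', [true, true, true, false, false])); // true
console.log(decide('Unanimous', [true, true, true, false, false])); // false
```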

Models, Data Structures & Compute Resources

Hypi data modelling

Hypi models are defined using GraphQL. For detailed documentation on what GraphQL is, consult the GraphQL documentation. In summary, it is a data definition and query language. It defines a type system used for modelling a domain and functions for working with the models in that domain.

Introduction to GraphQL

For the purposes of Hypi, we will discuss only the GraphQL features it supports. GraphQL operations fall under three broad categories: queries, mutations and subscriptions. Its type system consists of scalars and composites. Composites are further broken down into output and input types.

GraphQL by example

It can be useful to have a practical example to work with, so let's imagine we're developing a simple inventory app. We will define the basic models that may be used and use them to demonstrate how they fit in with Hypi.

As a developer of the inventory app you could create a simple model along these lines:

type Inventory {
  name: String!
  items: [Item!]!
}

type Item {
  name: String!
  sku: String
  description: String @field(indexed: true)
  price: Float!
}

One non-standard GraphQL element appears in this model: the @field directive. It takes an indexed argument; if set to true, the field becomes searchable, i.e. you will be able to find records by searching the contents of this field. Any number of fields can be indexed. The exact mechanism by which you can search is discussed later.

Hypi will generate several things from this model. By default, Hypi does not generate a Relay pagination API, but you can use the @withrelay directive on a type to have it generate a Relay- and Apollo-compatible version of your model. From this model, the following are generated:

type Inventory {
  name: String!
  id: IDPK
  hypi: Hypi
  items(first: Int, after: String): ItemConnection
}

type IDPK {
  value: ID
  publisherRealm: String
  instanceRealm: String
  version: String
}

type ItemConnection {
  pageInfo: PageInfo!
  edges: [ItemEdge]
}

type ItemEdge {
  node: Item
  cursor: String
}

#this type is not generated, it is provided by Hypi but included here for completeness
type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
}

input InventoryInput {
  id: ID
  name: String!
  items: [ItemInput!]!
}

input ItemInput {
  id: ID
  name: String!
  sku: String
  description: String
  price: Float!
}

input InventoryInputOpt {
  id: ID
  name: String
  items: [ItemInputOpt]
}

input ItemInputOpt {
  id: ID
  name: String
  sku: String
  description: String
  price: Float
}

type Query {
  getInventory(id: ID!): Inventory
  getSomeInventory(ids: [ID!]!): [Inventory!]!
  findInventory(filter: String, first: Int, after: String): InventoryConnection!
}

type Mutation {
  trashInventory(ids: [ID]): Boolean
  deleteInventory(ids: [ID]): Boolean
  createInventory(value: InventoryInput): Inventory
  updateInventory(value: InventoryInputOpt): Inventory
}

By going through this generated version of the inventory model, it should become clear what Hypi is doing.
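As a quick usage sketch against this generated API (operation names and argument values are illustrative):

```graphql
# Create an inventory, then page through its items (values are illustrative).
mutation CreateInventory {
  createInventory(value: {
    name: "Main warehouse"
    items: [{ name: "Widget", price: 9.99 }]
  }) {
    id { value }
    name
  }
}

query GetInventory {
  getInventory(id: "an-inventory-id") {
    name
    items(first: 10) {
      pageInfo { hasNextPage }
      edges { node { name price } }
    }
  }
}
```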