Things to know before implementing authentication in your web application

If you are building a production-quality web application, you need to think about security: the different types of attacks that can make your application vulnerable, and the different authentication/authorization mechanisms available.

I have prepared a list of a few concepts that helped me understand the bigger picture. Please note that the intention of this post is just to give you a basic overview of these terms. Some of these topics are huge, and you will need to go deeper to get a better understanding.

1) Sticky Session vs Non Sticky Session -> If your website is served by multiple web servers (a server farm) behind a load balancer, it is the load balancer’s task to decide which server node each request goes to. If sticky sessions are configured, all requests from one user go to the same node (web server), whereas with non-sticky sessions the load balancer may choose any server to serve each request.
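
The difference can be sketched in a few lines of Java. This is only an illustration, not how any particular load balancer works: the NodeChooser class, its method names and the node names are hypothetical, and real load balancers usually rely on a cookie or the client IP rather than a plain hash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of how a load balancer might pick a node.
final class NodeChooser {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    NodeChooser(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Sticky: the same session ID always maps to the same node.
    String sticky(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), nodes.size());
        return nodes.get(index);
    }

    // Non-sticky: simple round-robin that ignores who the user is.
    String roundRobin() {
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```

With sticky routing, server-side session data can live in the memory of one node. With non-sticky routing, the session has to be shared between nodes or kept out of server memory altogether.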

2) Cookies vs Local Storage/Session Storage

-> All of these are client-side storage solutions. Cookies have been around for a long time, whereas Local Storage and Session Storage (together called Web Storage) were introduced with HTML5.

a) Amount of Information -> Cookies can store far less than Local/Session storage: a cookie is limited to roughly 4 KB, while Web Storage typically allows around 5 MB (the exact limit depends on the browser). Cookies are also sent with every request, so storing a large amount of data in them hurts your website’s performance. If you have a lot of data to store, use Session/Local storage.

b) Lifetime -> Let’s start with web storage. Anything you put in session storage is available only in that session (tab). If you open the same website in another tab, the information stored in session storage is not accessible there. If you want the information to be available across tabs, use local storage. Note that session storage is lost once you close the tab or the browser, whereas local storage persists until it is explicitly cleared. All of these client storage solutions can be cleared by the user at any time.

Coming back to cookies: there are two types, persistent and non-persistent. If you want a cookie to expire at a specific time, set an expiration date/time on it. Such cookies are called persistent cookies. If you want a cookie to expire when the browser session ends, don’t set an expiration time. These are called non-persistent (session) cookies.

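c) Data Type -> Cookies and Local/Session Storage both store only strings. To keep objects or arrays in Web Storage, serialize them first (for example with JSON.stringify) and parse them back with JSON.parse when reading.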

d) Accessibility -> In AngularJS you can access session/local storage through $window.sessionStorage and $window.localStorage, and cookies through $cookies and $cookieStore ($cookieStore is deprecated in newer AngularJS releases in favor of $cookies).

You can inspect your Web Storage in Chrome DevTools under the Resources tab (renamed to Application in newer Chrome versions).

e) Security ->

Note that you can’t read or set cookies set by another website, as per the Same Origin Policy. Also, the data you save in a cookie is sent to the server with every request by default, which is not true for Session/Local storage. In fact, if you are storing any authentication information, it’s better to use cookies, since you can apply the HTTP Only and Secure flags, which give your data more protection. I will discuss the HTTP Only and Secure flags later.

3) Cookie Flags (HTTP Only and Secure) -> You can make the data stored in a cookie more secure by using the HTTP Only and Secure flags. HTTP Only helps limit the damage of an XSS attack: if this flag is set in the Set-Cookie header, client-side JavaScript can’t read the cookie, and only the server receives it. Note that if your browser doesn’t support the HTTP Only flag, the cookie will be treated as a normal cookie.

The Secure flag means the cookie will be included only when the request is sent over a secure channel (HTTPS).
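
To make the two flags concrete, here is a minimal Java sketch that builds the value of a Set-Cookie header by hand. The SessionCookie class and its setCookieHeader method are hypothetical names used only for illustration. In a real application you would normally let your web framework set these flags for you.

```java
// Hypothetical helper: builds a Set-Cookie header value by hand.
final class SessionCookie {
    static String setCookieHeader(String name, String value, boolean secure, boolean httpOnly) {
        StringBuilder header = new StringBuilder(name)
                .append('=').append(value)
                .append("; Path=/");
        if (secure) {
            header.append("; Secure");   // only sent over HTTPS
        }
        if (httpOnly) {
            header.append("; HttpOnly"); // hidden from client-side JavaScript
        }
        return header.toString();
    }
}
```

Calling SessionCookie.setCookieHeader("SESSIONID", "abc123", true, true) returns "SESSIONID=abc123; Path=/; Secure; HttpOnly".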

4) CSRF/XSRF Attack -> Cross Site Request Forgery is often confused with XSS (Cross Site Scripting), but these are two different things.

Imagine that you are logged into your banking site in one tab and have also opened some evil web page in another tab. The attack happens when the evil page makes a fraudulent request to the secure site, which the browser sends along with your cookies because you are already logged in. The attacker can change your settings or transfer money from your account, and neither you nor your banking site would notice until the fraudulent action is completed.

Note that it doesn’t matter whether the target functionality uses GET or POST; both are vulnerable. You must have seen that most banking websites log you out after a few minutes of inactivity. It can be inconvenient for some users, but it’s for our safety.

To prevent CSRF attacks, your server needs to send a secret to the browser that can only be read by JavaScript running on your own pages and is not available to a malicious website in another tab or browser. Every request to the server should include this secret in its header to prove it is authentic. Most modern web frameworks provide features to prevent CSRF attacks. In AngularJS, $http has a built-in solution for this. Your server must set a token in a session cookie called XSRF-TOKEN on the first GET request made by the application. This token must be unique for the session, and the cookie must be readable by JavaScript (so it cannot be HTTP Only).

In an AngularJS application, the $http service reads this token from the cookie and sends it in a header (X-XSRF-TOKEN) with every HTTP request it makes. The server must check this token on every request and block access if the token is not valid.
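
The server side of this exchange can be sketched in Java as follows. This is a minimal sketch under some assumptions: the XsrfTokens class and its method names are hypothetical, the token is stored with the user's session on the server, and your framework takes care of writing the XSRF-TOKEN cookie and reading the X-XSRF-TOKEN header.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for issuing and checking XSRF tokens.
final class XsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates an unpredictable token to place in the XSRF-TOKEN cookie.
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Compares the token stored for the session with the X-XSRF-TOKEN header value.
    static boolean isValid(String sessionToken, String headerToken) {
        if (sessionToken == null || headerToken == null) {
            return false;
        }
        // MessageDigest.isEqual is used to avoid leaking information through timing.
        return MessageDigest.isEqual(
                sessionToken.getBytes(StandardCharsets.UTF_8),
                headerToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

A request whose header is missing or doesn't match the session's token should be rejected, for example with HTTP 403.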

 

5) XSS

6) CORS/JSONP -> As a web developer, it’s common to fetch data from a third-party service, but web browsers enforce the Same-Origin Policy, which means an XHR request can only interact with resources from the same origin. An origin is defined as the combination of protocol, host and port.
However, there are several techniques for accessing data exposed by external servers:
a) JSONP (JSON with padding)
b) CORS (Cross-Origin Resource Sharing)
CORS is the better technique, and I have seen people use it more often than JSONP. With CORS, you need configuration on the server side and a browser that supports CORS. The idea is to have the browser and the foreign server coordinate to conditionally allow cross-domain requests: the browser sends the request with the appropriate headers (such as Origin), the foreign server responds with headers such as Access-Control-Allow-Origin, and the browser interprets that response to decide whether the cross-domain request may complete.
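
The server's part of that coordination can be sketched in Java. This is a simplified illustration that ignores preflight (OPTIONS) requests. The CorsPolicy class, its responseHeaders method and the example origin are hypothetical. Only the header names come from the CORS standard.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: decides which CORS headers to add to a response.
final class CorsPolicy {
    private final Set<String> allowedOrigins;

    CorsPolicy(Set<String> allowedOrigins) {
        this.allowedOrigins = new HashSet<>(allowedOrigins);
    }

    // requestOrigin is the value of the Origin header sent by the browser.
    Map<String, String> responseHeaders(String requestOrigin) {
        if (requestOrigin == null || !allowedOrigins.contains(requestOrigin)) {
            // No CORS headers: the browser will block the cross-origin read.
            return Collections.emptyMap();
        }
        Map<String, String> headers = new HashMap<>();
        headers.put("Access-Control-Allow-Origin", requestOrigin);
        headers.put("Vary", "Origin"); // the response differs per origin, so caches must know
        return headers;
    }
}
```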

 

7) Token based vs Session based authentication

 

 

My notes on XSD

Things you should know about XSD
I have listed a few things below that might be useful when designing your XSD if you are writing contract-first web services.

1) Hashmap in XSD -> XSD has no built-in hashmap type, but there are alternatives. One of them is to create a list of pairs, where each pair has a key and a value. There are many other ways to do this, but this one is easy and clean, and I am sure it will work perfectly fine in your production system. In this example, PersonToPhoneNumbers is the name of the hashmap. It can contain a maximum of 20 entries. I have declared both key and value as xs:string, but you can choose any data type you want.

<xs:element name="PersonToPhoneNumbers" type="Map" minOccurs="0" maxOccurs="20">
</xs:element>
<xs:complexType name="Map">
  <xs:sequence>
    <xs:element name="key" type="xs:string"/>
    <xs:element name="value" type="xs:string" minOccurs="0" maxOccurs="20"/>
  </xs:sequence>
</xs:complexType>
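
On the Java side, such a list of pairs can be folded back into a real map. The sketch below is only an assumption about how you might do it: the Entry class is a hypothetical stand-in for whatever class your binding tool generates for the Map type, and because value may occur up to 20 times, each key maps to a list of values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PairListToMap {
    // Hypothetical stand-in for the class generated from the "Map" complex type.
    static final class Entry {
        final String key;
        final List<String> values;

        Entry(String key, List<String> values) {
            this.key = key;
            this.values = values;
        }
    }

    // Folds the list of key/value pairs into a map, merging repeated keys.
    static Map<String, List<String>> toMap(List<Entry> entries) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Entry entry : entries) {
            result.computeIfAbsent(entry.key, k -> new ArrayList<>()).addAll(entry.values);
        }
        return result;
    }
}
```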

2) DateTime in XSD -> If you use the xs:dateTime data type in your XSD, by default it will be mapped to XMLGregorianCalendar. If you want it mapped to java.util.Calendar instead, add the entry below to your binding.xml.

<jaxb:globalBindings>
<xjc:serializable/>
<jaxb:javaType name="java.util.Calendar" xmlType="xs:dateTime" parseMethod="javax.xml.bind.DatatypeConverter.parseDateTime" printMethod="javax.xml.bind.DatatypeConverter.printDateTime" />
</jaxb:globalBindings>

If you don’t want to do this, you can always convert XMLGregorianCalendar to java.util.Calendar in your Java code.
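
A minimal sketch of that conversion, using only the standard javax.xml.datatype API. The DateTimeConversion class and its method names are hypothetical, and the parse helper is only there to make the example self-contained.

```java
import java.util.Calendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

final class DateTimeConversion {
    // XMLGregorianCalendar -> java.util.Calendar (GregorianCalendar is a Calendar).
    static Calendar toCalendar(XMLGregorianCalendar value) {
        return value.toGregorianCalendar();
    }

    // Parses an xs:dateTime lexical value such as "2015-03-01T10:15:30Z".
    static XMLGregorianCalendar parse(String lexical) {
        try {
            return DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
    }
}
```

For example, DateTimeConversion.toCalendar(DateTimeConversion.parse("2015-03-01T10:15:30Z")) gives a Calendar whose time zone matches the offset in the XML value.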

Exception Handling in REST

So you have written your first ‘Hello World’ REST service, and now it’s time to add exception handling to it. Poor exception handling can cause you a lot of pain when you go to production. I have listed some of the best practices I have learnt from past projects. In my example, I am using AngularJS in the frontend. You can use any framework in the backend (CXF, Jersey); I have personally used CXF.

1) Server Side -> Subclass WebApplicationException in your project and throw this custom exception from your REST service. Note that WebApplicationException is part of javax.ws.rs-api.jar, so you need to include this Maven entry in your POM.

<dependency>
 <groupId>javax.ws.rs</groupId>
 <artifactId>javax.ws.rs-api</artifactId>
 <version>2.0</version>
</dependency>

Your Custom Exception

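import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.Response;

public class MyException extends WebApplicationException
{
    // No arguments: maps to HTTP 404 Not Found.
    public MyException()
    {
        super(Response.Status.NOT_FOUND);
    }

    // Message only: maps to HTTP 500 with a plain-text body.
    public MyException(String message)
    {
        super(Response.status(Response.Status.INTERNAL_SERVER_ERROR).entity(message).type("text/plain").build());
    }

    // Caller-supplied HTTP status code with a plain-text body.
    public MyException(int status, String message)
    {
        super(Response.status(status).entity(message).type("text/plain").build());
    }

}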

2) Client Side -> On the AngularJS side, you need to do something similar. Whenever you throw a WebApplicationException from your REST service, control goes to the error callback of the AngularJS $http service, where you can read the status code and error message sent by the REST service.

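$http.post("/getUserAccountDetails", requestedData, http_defaults).then(function (response)
{
    return response;
},
function (error)
{
    // control will come here; error.status holds the HTTP status code
    // and error.data holds the message sent by the REST service
});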

3) If you are using contract-first web services, there is another way to handle exceptions in a REST service with AngularJS. Let’s say you don’t want the error callback in AngularJS to run; instead, you want control to reach the success callback, and inside it you will have some kind of fork to separate success and error conditions. In that case, don’t throw WebApplicationException from your REST service. Instead, return a normal response, and in a response header return a code that differentiates between success and error conditions.

Server Class vs Client Class

In Java, some machines are classified as ‘Client’ class and some as ‘Server’ class. It’s important to understand these two terms if you are dealing with Java performance issues.

Client-class machines are any 32-bit JVM running on a machine with one CPU (regardless of the operating system) and any 32-bit JVM running on Microsoft Windows (regardless of the number of CPUs on that machine). All other machines, which include all 64-bit JVMs, are considered server class.

It’s also important to note that if you have 32-bit operating system then you must use a 32-bit JVM. If you have a 64-bit operating system then you can either choose 32-bit or 64-bit JVM.
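
If you are not sure what you are running on, you can ask the JVM itself. The small sketch below uses standard system properties; the JvmInfo class name is hypothetical, and the exact value of java.vm.name depends on your JVM vendor and version.

```java
final class JvmInfo {
    // Collects the VM name, CPU architecture and CPU count reported by the JVM.
    static String describe() {
        return "vm=" + System.getProperty("java.vm.name")
                + ", os.arch=" + System.getProperty("os.arch")
                + ", cpus=" + Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On HotSpot, the VM name typically contains "Client VM" or "Server VM", which tells you which compiler mode was selected.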

Garbage Collector Basics

When I first read about the Java garbage collector, I had so many questions: why is the heap divided into multiple generations (why this design), what is stop-the-world, why are there multiple garbage collector algorithms, and how do I know which one is best for my application? Why can’t they just create one best algorithm? If you already know the answers to all these questions, you can stop reading :).

Java (as of Java 8) ships with 4 garbage collectors:
1) Serial Collector
2) Parallel Collector (Throughput Collector)
3) Concurrent Collector (CMS)
4) G1 Collector

Now the question is, which one is best for your application? As we all know, the GC’s job is to find objects that no longer have any references to them and free the memory linked to those objects. But reference counting is not sufficient. Consider a circular linked list, where each object in the list is pointed to by another object in the list. If nothing outside the list refers to it, the list is not in use and should be freed, yet every node still has a reference count of one. So references can’t be tracked just by keeping a count; instead, the garbage collector must periodically search the heap for unused objects.
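
The idea of searching from the live references instead of counting can be shown with a toy model. The sketch below is not how the JVM is implemented: Reachability, Node and mark are hypothetical names, and the code only illustrates the "mark" step of a tracing collector on a small object graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class Reachability {
    // Toy object: a name plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Returns every node reachable from the roots; everything else is garbage.
    static Set<Node> mark(Collection<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (live.add(node)) {
                pending.addAll(node.references);
            }
        }
        return live;
    }
}
```

If two nodes point at each other but no root leads to them, neither ends up in the live set, even though each has a reference count of one.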

Now, in the process of freeing memory and reusing it for future allocations, the garbage collector also needs to deal with memory fragmentation. This is where the different garbage collector algorithms differ from each other in implementation. During compaction, when the GC defragments the heap by moving objects around in memory, it must make sure that application threads are not using those objects, because their memory locations change during the operation.

This event, when all the application threads are stopped, is called ‘stop-the-world’, and it has a major impact on the performance of an application. A developer’s task is to minimize these pauses by tuning the GC.

The garbage collector divides the heap into multiple generations, called the young generation and the old generation. The young generation is further divided into Eden and Survivor spaces. When the young generation fills up, the GC stops all the application threads and cleans up the young generation. Objects that are no longer in use are discarded, and objects that are still in use are moved to a Survivor space or, once they have survived long enough, to the old generation. This process is called a minor GC.

Let’s try to understand why multiple generations are required. The rationale behind having separate generations is that many objects are used for a very short period of time. There are 2 benefits to this design approach.
1) Since the young generation is only a part of the entire heap, processing it is faster than processing the entire heap. This means the application threads are stopped for a much shorter period than if the entire heap were processed in one go.
2) Objects are created in the Eden space, and during garbage collection every object in Eden is either moved or discarded. Because all live objects are moved to a Survivor space or the next generation, Eden is left empty, so it is effectively compacted every time a garbage collection happens.

So there are 4 garbage collector algorithms. The question is how similar they are and where they differ.

Let’s talk about the similarities first:
1) All GC algorithms divide the heap into young and old generations.

2) All GC algorithms use a stop-the-world approach to clear objects from the young generation (minor GC).

GC algorithms have their biggest differences in the way they clean up the old generation (full GC).
The simpler algorithms (Serial and Throughput Collector) stop all application threads, find unused objects, free their memory and then compact the heap. This generally causes a long pause for the application threads.

The more complex algorithms (CMS and G1) find unused objects without stopping all the application threads.
When using these algorithms, the application experiences fewer and shorter pauses. The only disadvantage is that the application uses more CPU overall.

So your choice of GC algorithm depends a lot on the type of your application and the performance level you are expecting.
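
Before tuning anything, it helps to check which collectors your JVM is actually using and how much time they have spent so far. The sketch below uses the standard java.lang.management API; the ActiveCollectors class name is hypothetical, and the collector names it prints depend on your JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // Names of the collectors the running JVM has registered.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time in milliseconds.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```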

String Performance

If you are working on improving the performance of your application, string concatenation is one area that can give you a significant improvement. It is safe to use string concatenation when the concatenation can be done in one line. For multi-line string concatenation, you should use StringBuilder. For example:


String str = love + "." + yourJob;

The Java compiler (javac before Java 9) translates this statement into code roughly equivalent to this:


String str = new StringBuilder(love).append(".").append(yourJob).toString();

This shows that single-line string concatenation shouldn’t cause any performance issues. The problem arises with multi-line string concatenation. For example:


String str = love;
str += ".";
str += yourJob;

This code will get translated into:


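String str = love;
str = new StringBuilder(str).append(".").toString();
str = new StringBuilder(str).append(yourJob).toString();

Notice that this time 2 StringBuilder objects (plus an extra intermediate String) are created instead of one, because the first line is a plain assignment and each += builds its own StringBuilder. This is inefficient, and the cost grows quickly when the concatenation happens inside a loop. Avoid string concatenation unless it can be done in a single statement.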
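
The loop case is where the difference matters most. Here is a minimal sketch comparing the two approaches; the JoinExample class and its method names are hypothetical, and both methods return the same result.

```java
final class JoinExample {
    // Each += copies the whole string built so far into a new String.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // A single StringBuilder is reused for every append.
    static String joinWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }
}
```

For a handful of parts the difference is negligible, but with thousands of iterations the repeated copying in the first version becomes noticeable.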