Why We Hate Java Serialization And What We're Doing About It by Brian Goetz & Stuart Marks

Java Serialization is well known to be one of the worst features of Java. It's been mentioned as such in various "Ask the Experts" panels over the past few years. Joshua Bloch's book "Effective Java" describes many dangers of serialization, and it devotes an entire chapter to the topic. But isn't this merely a recent group of Java platform designers complaining about a bad design produced by an earlier group of Java platform designers?

In fact, Java serialization was well designed for its intended purposes. In the 1990s, big topics of the day included transparent persistence of objects and distributed objects. Transparent persistence is the ability to save and restore objects without requiring explicit code in those objects. Distributed objects involves the transfer of objects and interactions among objects over a network, as in Java RMI. In the context of the late 1990s, Java Serialization supported both of these goals quite well.

The problem is that, while most of the industry has moved on from these goals, Java Serialization is still embedded deeply and pervasively into the platform. With most obsolescent features, libraries and applications can simply ignore them, and doing so poses no issues. Java Serialization is different. Code that on its face appears correct and secure might have bugs or security holes caused by the mere presence of serialization in the platform, even if that code doesn't use serialization explicitly. In many cases, it is simply not possible for high assurance code to ignore the possibility of serialization. Historically, and continuing to this day, serialization is the direct cause of many bugs and security holes in Java applications, libraries, and the JDK itself. Serialization thus imposes costs across the platform that cannot be ignored.
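
As a rough illustration of how code that looks correct can still be undermined, here is a minimal sketch that is not taken from the talk. The `Range` class and the `InvariantBypassDemo` helper are hypothetical names. The constructor enforces `lo <= hi`, yet a tampered byte stream should produce an instance that violates it, because default deserialization does not run the constructor. The byte patching assumes the default stream layout, in which the two `int` fields are written last and sorted by name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class InvariantBypassDemo {
    // Serializes a valid Range, tampers with the bytes, and deserializes the result.
    static Range forgeInvalidRange() {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(new Range(1, 2));
            }
            byte[] bytes = bos.toByteArray();
            // Assumption: default serialization writes the int fields at the end of
            // the stream, sorted by name (hi, then lo), so the final byte is the
            // low-order byte of lo. Overwriting it changes lo from 1 to 100.
            bytes[bytes.length - 1] = 100;
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                // The Range constructor, and therefore its check, is not invoked here.
                return (Range) ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Range r = forgeInvalidRange();
        System.out.println("lo=" + r.lo() + ", hi=" + r.hi());
    }
}

// Hypothetical value class: the constructor is the only visible way to build it,
// and it rejects any pair with lo > hi.
final class Range implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int lo;
    private final int hi;

    Range(int lo, int hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("lo > hi");
        }
        this.lo = lo;
        this.hi = hi;
    }

    int lo() { return lo; }
    int hi() { return hi; }
}
```

Reading the source of `Range` alone gives no hint that such an instance can exist, which is the sense in which serialization acts as a hidden, extra-linguistic constructor.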

This talk is neither a tutorial on serialization, nor is it merely a rant about how bad Java Serialization is. (We can't guarantee there won't be ranting, however.) It is instead a thorough analysis of a few of the fundamental aspects of the design of Java Serialization that, in retrospect, can be considered flaws. We will then provide examples of bugs in the JDK that resulted directly from these design decisions. Informed by this analysis, we will then proceed to describe potential new mechanisms we are exploring that may eventually replace the current Java Serialization mechanism. The direction of the new mechanism is to ensure that it is well integrated with the language model, and where necessary, that the language model is enhanced to accommodate serialization.

Another strong direction is to make the mechanisms explicit in the source code. This will make it possible to verify and reason about the correctness and security of a program by examining it, without having to consider "extra-lingual" mechanisms or "magical" behavior exhibited by the current Java Serialization mechanism.
Comments

Baking serialization into the programming language presumes that there is one canonical way to serialize the data of each object. But there may be multiple different ways to externally represent a given object - different forms to persist to a file on disk, to send to this system, to send to that system. It's still too magical.

tohopes

Really insightful. Facing the same problem when processing 20+ million messages per hour.

NirajSanghani

Very useful talk on these issues and well overdue. I hope we get another one (albeit likely shorter) on clone() / Cloneable and why that is bad design

nO_dNAL

45:20 The +1/-1 strategy is not a serious commitment anymore; it does not even touch an LTS release.

berndeckenfels

To follow up on the point made by Andy earlier, it feels like this work complects the success of updating Serialization to Pattern Matching unnecessarily. I've done that type of "use improvement B we really want to justify the necessity of implementing feature A" reasoning in my work, so I understand the impetus... But it really feels like this can be accomplished using existing mechanisms. The big insight/evolution here is to use annotations instead of implementing an interface.

For instance, consider the existence of a

import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public interface SerializationStrategy<T> {
    ObjectOutputStream serialize(T instance);
    T deserialize(ObjectInputStream stream);
}

Then you can use a ServiceLoader to populate the available SerializationStrategy in the classpath and implement a default dispatching mechanism fairly easily.
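
A minimal sketch of that dispatching idea might look like the following. It is an illustration under assumptions, not a tested design: `TypedSerializationStrategy` is a hypothetical variant of the interface above that adds a `type()` method so a registry can dispatch on class, and it uses `byte[]` instead of object streams to stay short. `StrategyRegistry`, `GridPoint`, and `GridPointStrategy` are likewise made-up names. Only `ServiceLoader.load` is real JDK API here.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

class StrategyDispatchDemo {
    public static void main(String[] args) {
        StrategyRegistry registry = StrategyRegistry.fromServiceLoader();
        // Registered by hand because this single-file sketch ships no
        // META-INF/services entry for ServiceLoader to discover.
        registry.register(new GridPointStrategy());

        byte[] bytes = registry.serialize(new GridPoint(3, 4));
        GridPoint copy = registry.deserialize(GridPoint.class, bytes);
        System.out.println(copy.x + "," + copy.y);
    }
}

// Hypothetical strategy interface; type() tells the registry which class it handles.
interface TypedSerializationStrategy<T> {
    Class<T> type();
    byte[] serialize(T instance);
    T deserialize(byte[] data);
}

final class StrategyRegistry {
    private final Map<Class<?>, TypedSerializationStrategy<?>> byType = new HashMap<>();

    // Collects every provider that ServiceLoader can find. A real provider would
    // need to be a public class with a public no-argument constructor.
    static StrategyRegistry fromServiceLoader() {
        StrategyRegistry registry = new StrategyRegistry();
        for (TypedSerializationStrategy<?> s
                : ServiceLoader.load(TypedSerializationStrategy.class)) {
            registry.register(s);
        }
        return registry;
    }

    void register(TypedSerializationStrategy<?> strategy) {
        byType.put(strategy.type(), strategy);
    }

    // Dispatches on the exact runtime class of the instance.
    @SuppressWarnings("unchecked")
    <T> byte[] serialize(T instance) {
        TypedSerializationStrategy<T> s =
            (TypedSerializationStrategy<T>) lookup(instance.getClass());
        return s.serialize(instance);
    }

    @SuppressWarnings("unchecked")
    <T> T deserialize(Class<T> type, byte[] data) {
        return ((TypedSerializationStrategy<T>) lookup(type)).deserialize(data);
    }

    private TypedSerializationStrategy<?> lookup(Class<?> type) {
        TypedSerializationStrategy<?> s = byType.get(type);
        if (s == null) {
            throw new IllegalArgumentException("No strategy registered for " + type.getName());
        }
        return s;
    }
}

// Hypothetical data class with no serialization code of its own.
final class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Externalizes a GridPoint as two big-endian ints and rebuilds it through the constructor.
final class GridPointStrategy implements TypedSerializationStrategy<GridPoint> {
    public Class<GridPoint> type() { return GridPoint.class; }

    public byte[] serialize(GridPoint p) {
        return ByteBuffer.allocate(8).putInt(p.x).putInt(p.y).array();
    }

    public GridPoint deserialize(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        return new GridPoint(buffer.getInt(), buffer.getInt());
    }
}
```

Because the strategy rebuilds the object through its ordinary constructor, any validation in that constructor still applies. The sketch does not address subclass lookup, versioning, or object graphs with cycles.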

At least, in theory. I haven't tried implementing it. But the approach seems straightforward.

Sure, Pattern Matching might make things syntactically nicer, but I'm not sold that it's a blocker or even a huge enabler here.

nicohidalgo-toledo

Nice presentation. Thank you. In my experience data objects that represent some more complex problem domain usually have direct or indirect cyclic references, typically parent - children, whole - parts etc. That's why both JPA and JAXB offer some solutions such as bidirectional references, @XmlID, @XmlIDREF, public void afterUnmarshal(Unmarshaller unmarshaller, Object parent) etc.

appsofteng

Isn’t the @Serializer pattern just an out-parameter? A Fortran out-parameter?

ArneBab

I will say, JSON frameworks can respect privacy and use constructors or factory methods if you tell the library where to look.

shadeblackwolf

I don't like the idea presented in this talk. Why not just use annotation processing that generates the Deserializer and Serializer classes for you?

AndiRadyKurniawan

Further application of "destructuring" simplifying/fixing frameworksphere - thank you

CasparMacRae

(Learning from that also tells us that running code during deserialisation might be risky, ask your local collection).

The talk also does not mention the new filter hooks on ObjectInputStream.
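
For readers who have not seen those hooks, here is a small sketch of a per-stream filter using `java.io.ObjectInputFilter` (available since Java 9). The classes `AllowedMessage`, `UnexpectedPayload`, and `FilterDemo` are hypothetical names invented for the example. The pattern allows one class plus `String` and rejects everything else with `!*`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class FilterDemo {
    // Serializes any object to a byte array with default Java serialization.
    static byte[] toBytes(Object value) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(value);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes with an allow-list filter: AllowedMessage and String pass,
    // every other class is rejected by the trailing "!*" pattern.
    static Object readFiltered(byte[] bytes) throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
            AllowedMessage.class.getName() + ";java.lang.String;!*");
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            ois.setObjectInputFilter(filter);
            return ois.readObject();
        }
    }

    // Returns true when the filter refuses the stream's contents.
    static boolean isRejected(byte[] bytes) {
        try {
            readFiltered(bytes);
            return false;
        } catch (InvalidClassException e) {
            return true; // thrown when the filter status is REJECTED
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(toBytes(new AllowedMessage("hello"))));
        System.out.println(isRejected(toBytes(new UnexpectedPayload())));
    }
}

// Hypothetical class the application expects to receive.
class AllowedMessage implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;

    AllowedMessage(String text) {
        this.text = text;
    }
}

// Hypothetical class the application never expects on this stream.
class UnexpectedPayload implements Serializable {
    private static final long serialVersionUID = 1L;
}
```

A filter limits which classes may be instantiated from a stream. It does not change the fact that allowed classes are still reconstructed without their constructors.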

berndeckenfels

Java should become an ECMA standard.
In this way the process of evolving the platform will become more clear.

gsit

I started coding when I was 49, which was 3 and a half years ago... I had to figure out a vulnerability fix related to Serialization, so I came here, but I'm pretty sure I will retire before I understand WTF these guys are talking about.. I know ignorance doesn't sell in this world, but damn. I think I'll stick with the friendly confines of React and the relative child's play of hooks.

mh

The first time I tried to send an object over a socket, I faced this. WTH were they thinking..

mrBrownstoneist

Good talk. But most importantly, I want their computer's stickers on mine.

witchdodo

A free solution from the community has already solved most of these problems. MicroStream is a fundamentally new serialization mechanism that enables you to store any Java object graph on disk and load it back into memory, even partially, very easily, which means you can even update your object graph in memory. It was created to enable Java to store any kind of object graph for any kind of use case and to replace heavyweight DBMSs, especially for microservices use cases. It provides high-security deserialization and object graph communication, and it's free.

markuskett