I wrote my own JSON parser. Also, JSON is a terrible standard.

Link to the code on GitLab:

Timestamps:
Intro (00:00)
Why JSON? (00:14)
What is JSON? (02:15)
Why is JSON a bad standard? (03:23)
Pothole markers (09:54)
Designing JSENSE (13:49)
Stress testing JSENSE (21:26)
Outro (27:38)

The JSON-C video:

Comments

21:00 You could potentially use const char* variadic arguments. In that case you'd call jse_get("file", "json") or jse_get("file.json"), depending on which you need.

BenWeisz

You actually saved me so much time. I was just going to write a JSON parser for the same reasons (all the libraries I've found are overly verbose). I didn't think the standard was so loose, although it is derived from JavaScript, so...

cathalogrady

7:51 perfect explanation, now I understand C arrays much better

maximood-tired

25:27 Surprised no one has mentioned this yet, but that file is JSON5, not actually JSON. The biggest changes are that it supports trailing commas and comments.

rabbishekelstein

While I feel like the video is great, and you manage to explain cool things in an interesting way, there are some issues:

I've also implemented a JSON library myself, multiple times, in C and C++. I use JSON to describe data in some of my projects (in OSDev, glTF parsing...). The power of JSON is its simplicity and, generally, the small size of the parser.
- 3:41 When you say you would love to have beginning and end markers: we already know when the file is expected to end, namely when the depth returns to 0 after parsing the first element. If you see a '{' token, you know the file has not ended until you have seen the corresponding closing bracket. Also, error handling should not live inside the JSON data; it belongs outside the standard. If you open a nonexistent file in an OS, the error is returned through another channel rather than through a fake file containing 'null'. JSON is used to store errors in an API because that's a whole other abstraction: it's not the JSON parser that returns the error, but the API. I explain later why the degenerate cases make sense.
Beginning and end markers also sit outside the JSON standard: in a file system we already know the length of the file, and in a request we know its length too.
- 5:40 The key isn't the only way to access an element. Generally it's like a map: you look up an element by its key, or you iterate over the elements one by one. Nothing restrains you from using a JSON map to store things in the form {name}: {data}, so using UTF-8 seems logical. Also, for a text-based data format, it isn't one specific part of the document that is UTF-8 but the whole document.
- 7:00 JSON arrays are designed so that a recursive JSON parser is easy to implement: each entry is parsed like a whole new JSON document. Enforcing a single type for all array entries would just be slower, because you would have to check each entry's type. (For example, a parser I helped implement a long time ago: (GitHub).) It seems like you want to map a JSON array structure perfectly onto a C struct, and you can do that with the help of unions:

struct JsonValue {
    JsonType type;
    union {
        JsonMap map;
        JsonArray array;
        float floating;
        ...
    };
};

Then you treat everything as a JsonValue, including your document. That's why a document can be just a single value (3:23, 3:16), and why an array can hold multiple types.
It's meant to be implemented that way because a 'Value' can be a literal, an object, an array...

You say that it's slower because it doesn't have constraints, but that's false. First, if speed is the most important consideration, you should not use JSON but rather a binary format (though binary formats have their own disadvantages). JSON aims to be simple to implement as a recursive parser while still being easy to edit. And JSON is already fast enough: in the end, the bottleneck you hit isn't your library but the file system and disk themselves.

- 11:46 The standard should not specify a minimum value, because that is specific to each implementation. JSON is meant to be used inside other specifications, or your own: it's used in JavaScript, it's used in the glTF specification, and so on. Those specifications can set their own limits on JSON values. Your JSON file targets a consumer it knows in advance (you wouldn't write a JSON file without knowing where it will go, would you?). The C standard puts limits on things (like variable name length) because it's meant to be used by multiple compilers, but JSON is generally only used to abstract data for another standard.
For example, a glTF file may be massive (10 MB+), so a high limit is important, but for a config file a 1 MB+ limit seems overkill. So it's implementation-dependent.
(The same goes for points 2, 3, and 4.)

- 14:30 I feel like the comparison is a little unfair, because your code doesn't have error handling. Technically, the code on the right doesn't handle it either, but the library it uses does.

Have a nice day! Great work.

cyp_

About ASCII keys: it's cool, and pretty understandable, that you don't want emojis in there, but it breaks down if you want keys written in a language with non-ASCII letters. Contrary to popular belief, not all APIs are in English.

oxey_

JSON is fine. Not being strongly typed is by design. There are alternative strongly typed wire formats if that's your use case (e.g. Protocol Buffers).

We live in the real world: people use different languages, run incompatible software versions, enter garbage data, and have to pay contractors to update their software. JSON allows sufficient leeway for all that (backwards compatibility, failing safe).

voidvector

Probably the only good notation is s-expressions. XML, JSON, and the rest all have problems.

deadmarshal

It's almost like the standard was written by programmers who anticipated writing JSON parsers themselves, and wanted the "freedom" to change the criteria of their implementation.

cathalogrady

On 6:56, when you say that JSON does not understand arrays, I think it's worse than that. You see, JSON does not understand anything, really: it is a specification. Arrays in C and other low-level languages are, in your words, contiguous memory holding elements of the same size. But JSON is not a low-level language; it does not have to use the same definition of array. In the end, JSON is just a text file with certain syntactical rules, that is, structure. Literals (true, false, null) in JSON are just strings, like everything else. That is why it's so funny to me when people use it as an example of unstructured data, which is nothing but maximum entropy. But I agree with you that the curse of JSON is too much flexibility and not being able to know what to expect. In my field of work, this mostly shows up as a confusion where people seem to believe that JSON can replace a data model, which is a completely different concept. And this is also where constraints come into play! So yeah, I agree with you that the lack of defined constraints severely limits JSON as a data exchange format, and it causes a large percentage of the headaches for most data engineers, on top of the business loss that follows from the semantic loss caused by the lack of constraints.

vicsteiner

Why are you including pthreads? I don't see it used in the code.

cathalogrady

So JSON would be so much better if it only supported American English? The American Standard Code for Information Interchange? Maybe get a globe: there's a whole world out here that isn't in the USA.

pooroldpedro