Wednesday, November 21, 2007

XmlSerializer problem

I recently ran into a problem with XmlSerializer.

XmlSerializer happily serializes a string that contains a control character in the range 0x01-0x1F (other than tab, line feed, and carriage return, which XML allows), but deserializing the result throws an exception:

{"There is an error in XML document (6, 14)."}
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle)
at System.Xml.Serialization.XmlSerializer.Deserialize(Stream stream)
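As a rough aid for spotting such strings before they are serialized, the sketch below checks each character against the Char production of XML 1.0. The class and method names are made up for this illustration, and surrogate pairs are deliberately not handled, so treat it as a sketch rather than a complete validator.

```csharp
public static class XmlCharCheck
{
    // True if the UTF-16 code unit is allowed by the XML 1.0 Char production.
    // Assumption of this sketch: surrogate code units are reported as illegal,
    // even when they form a valid pair.
    public static bool IsLegalXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Index of the first character that XML 1.0 forbids, or -1 if there is none.
    public static int IndexOfIllegalXmlChar(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (!IsLegalXmlChar(text[i]))
            {
                return i;
            }
        }
        return -1;
    }
}
```

Calling XmlCharCheck.IndexOfIllegalXmlChar on a string that ends in character 0x06 points at that character, while ordinary text with line breaks passes.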

To reproduce it:

Create a C# class:

using System;
using System.Xml;
using System.Xml.Serialization;

namespace whatever
{
    // FooWhateverSerializableObject stands for any serializable class
    // with a public string property named MessageText.
    [Serializable]
    [XmlRoot("FooRoot")]
    public class Foo
    {
        // One shared serializer instance for all Foo objects.
        static XmlSerializer internalSerializer = new XmlSerializer(typeof(Foo));

        public byte[] Serialize()
        {
            using (System.IO.MemoryStream MS = new System.IO.MemoryStream())
            {
                internalSerializer.Serialize(MS, this);
                return MS.ToArray();
            }
        }

        public static Foo Deserialize(byte[] val)
        {
            using (System.IO.MemoryStream MS = new System.IO.MemoryStream(val))
            {
                try
                {
                    // Passing the Stream directly is what triggers the problem.
                    object obj = internalSerializer.Deserialize(MS);
                    return (Foo)obj;
                }
                catch (Exception ex)
                {
                    throw new Exception("Whoops", ex);
                }
            }
        }

        object _internalContainer;

        [XmlElement("OBJInside", typeof(FooWhateverSerializableObject))]
        public object ObjectContainer
        {
            get { return _internalContainer; }
            set { _internalContainer = value; }
        }
    }
}

Now add this code to Main:

Foo foo = new Foo();
FooWhateverSerializableObject fwso = new FooWhateverSerializableObject();
fwso.MessageText = "dsd" + System.Text.Encoding.UTF8.GetString(new byte[] { 6 });
foo.ObjectContainer = fwso;

byte[] fooB = foo.Serialize();
// In s, character 0x06 shows up as the character reference &#x6;, meaning the serializer escapes the unpermitted character.
string s = System.Text.Encoding.UTF8.GetString(fooB);

Foo foo2 = Foo.Deserialize(fooB); // crashes here, due to the "bug"


How do you fix it?

Replace:

using (System.IO.MemoryStream MS = new System.IO.MemoryStream(val))
{
    object obj = internalSerializer.Deserialize(MS);
}

with:

using (System.IO.MemoryStream MS = new System.IO.MemoryStream(val))
{
    using (XmlTextReader XTR = new XmlTextReader(MS))
    {
        object obj = internalSerializer.Deserialize(XTR);
    }
}

In other words, pass an XmlReader (here an XmlTextReader) to XmlSerializer.Deserialize instead of a Stream.
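To try the reader-based approach in isolation, here is a minimal sketch of a generic helper that always deserializes through an XmlTextReader it creates itself. The Note class and the ReaderBasedDeserializer name are hypothetical and exist only for this example. Reading from a string instead of a byte array is also just a convenience of the sketch.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical payload type, used only for this illustration.
public class Note
{
    public string Text;
}

public static class ReaderBasedDeserializer
{
    // Deserializes through an explicitly created XmlTextReader.
    // The reader's Normalization property is left at its default (false),
    // which is the point of the workaround described above.
    public static T FromXml<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader text = new StringReader(xml))
        using (XmlTextReader reader = new XmlTextReader(text))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With this in place, ReaderBasedDeserializer.FromXml<Note>("<Note><Text>dsd&#x6;</Text></Note>") should hand back a Note whose Text ends in character 0x06 instead of throwing.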

Why?
Because Deserialize(Stream), as decompiled with Reflector, does the following:
public object Deserialize(Stream stream)
{
    XmlTextReader xmlReader = new XmlTextReader(stream);
    xmlReader.WhitespaceHandling = WhitespaceHandling.Significant;
    xmlReader.Normalization = true;
    xmlReader.XmlResolver = null;
    return this.Deserialize(xmlReader, (string) null);
}

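Deserialize(XmlReader), by contrast, goes straight to Deserialize(xmlReader, string) and leaves the reader as you configured it. An XmlTextReader you create yourself keeps Normalization at its default of false, and in that mode it does not range-check numeric character references such as &#x6;. With Normalization set to true the same reference is rejected as an invalid character, which is the exception shown at the top of this post.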
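The effect of the Normalization property can be seen without XmlSerializer at all. The sketch below reads the same small document with an XmlTextReader twice, once per setting. The NormalizationDemo class and its method names are invented for this illustration.

```csharp
using System.IO;
using System.Xml;

public static class NormalizationDemo
{
    // Reads the text content of the root element with an XmlTextReader
    // whose Normalization property is set as requested.
    public static string ReadRootText(string xml, bool normalization)
    {
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            reader.Normalization = normalization;
            reader.MoveToContent();
            return reader.ReadElementString();
        }
    }

    // True if reading the document raises an XmlException.
    public static bool Throws(string xml, bool normalization)
    {
        try
        {
            ReadRootText(xml, normalization);
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }
}
```

For the input "<r>dsd&#x6;</r>", I would expect Throws to return true with normalization switched on and false with it switched off, matching the behaviour of the two Deserialize overloads.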

Conclusion
I don't know whether this is a bug or by design, but it clearly cost me some time. As a final piece of advice: don't pass a Stream to XmlSerializer.Deserialize; pass an XmlReader (such as an XmlTextReader) instead.
