[PATCH] POST params encoding in Ruby 1.9.1
Reported by Serge Balyuk | June 28th, 2010 @ 10:12 AM
Please find attached yet another approach to force_encoding-based params treatment in Ruby 1.9.1.
#48 spawned a great discussion. It seems that standards can be broken and browsers can misbehave. But I'm not sure why we should get params encoded in ASCII when the specified charset and the actual charset do match (i.e. cases where the browser did its job well, which happens pretty often with UTF-8).
An additional feature is the env['rack.force_content_charset'] option, which can be used to override the request header setting, so that middleware can do the trick described by naruse (http://rack.lighthouseapp.com/projects/22435/tickets/48-rackutilsun...).
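A minimal sketch of how a middleware might set that override. The class name ForceContentCharset is hypothetical, and the env key is the one proposed by this patch, not something stock Rack is known to read:

```ruby
# Hypothetical middleware: sets the override key proposed in this patch
# before handing the request to the rest of the stack.
class ForceContentCharset
  def initialize(app, charset = 'UTF-8')
    @app = app
    @charset = charset
  end

  def call(env)
    # Overrides whatever charset the Content-Type header declared.
    env['rack.force_content_charset'] = @charset
    @app.call(env)
  end
end
```

A real middleware would probably decide the charset per request instead of using a fixed value.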
Utils::unescape was patched so that the result now preserves the encoding of the input string (this was not the case for strings containing hex-encoded parts).
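The idea can be sketched roughly as follows. This is not the patched Rack code; unescape_preserving_encoding is a hypothetical name, and the sketch assumes the decoding is done on a binary copy and the original encoding is restored afterwards:

```ruby
# Hypothetical helper, not Rack's implementation.
def unescape_preserving_encoding(str)
  original = str.encoding
  # Work on a binary copy so that decoded bytes >= 0x80 can be spliced in
  # without encoding compatibility errors.
  binary = str.dup.force_encoding(Encoding::BINARY)
  decoded = binary.tr('+', ' ').gsub(/%([0-9a-fA-F]{2})/) { $1.hex.chr }
  # Re-tag the result with the encoding the caller passed in.
  decoded.force_encoding(original)
end
```

The result may still be invalid in that encoding if the escaped bytes do not match it, so callers may want to check valid_encoding? before trusting it.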
This patch takes care only of body-encoded parameters (POST), because the charset parameter of Content-Type describes the body. I'm not sure how I should treat params that come from the URI query, though. Any comments are welcome.
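A rough sketch of the body-params treatment described above, under stated assumptions: all helper names are hypothetical, the charset comes from the override key or from the Content-Type header, and a string is only re-tagged when its bytes are actually valid in the declared encoding:

```ruby
# Hypothetical helpers; names are illustrative and not part of Rack.

# Pull the charset parameter out of a Content-Type header value, if any.
def charset_from(content_type)
  return nil unless content_type
  content_type[/;\s*charset\s*=\s*"?([^";\s]+)"?/i, 1]
end

# Look up an Encoding by name; nil for missing or unknown names.
def find_encoding(name)
  name && Encoding.find(name)
rescue ArgumentError
  nil
end

# Walk nested params and tag strings whose bytes are valid in `encoding`.
def tag_params(value, encoding)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), memo| memo[k] = tag_params(v, encoding) }
  when Array
    value.map { |v| tag_params(v, encoding) }
  when String
    candidate = value.dup.force_encoding(encoding)
    candidate.valid_encoding? ? candidate : value
  else
    value
  end
end

def force_params_encoding(params, env)
  charset = env['rack.force_content_charset'] || charset_from(env['CONTENT_TYPE'])
  encoding = find_encoding(charset)
  encoding ? tag_params(params, encoding) : params
end
```

Strings that are not valid in the declared charset are left untouched, which covers the misbehaving-browser case.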
BTW I had an alternative implementation which performed set_encoding on env['rack.input']. I liked the idea of having the whole body encoding set according to the request header and then naturally spreading it everywhere. Although it didn't break any existing tests, it still seemed to be unsafe for the code parsing multipart form submissions. At the same time, that parsing implementation does not preserve the input stream encoding in the resulting hash, so explicit force_encoding calls would still be required. So I've dropped that option for now.
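For illustration only, here is what the dropped alternative looks like with a StringIO standing in for env['rack.input'] (an assumption; real servers may supply other IO-like objects). It also shows one observable detail of Ruby's IO API: a full read picks up the stream encoding, while a length-limited read returns a binary string.

```ruby
require 'stringio'

# Stand-in for env['rack.input'], holding raw bytes as a server might.
body = "name=caf\xC3\xA9".dup.force_encoding(Encoding::BINARY)
input = StringIO.new(body)

# Tag the whole stream with the charset declared by the request.
input.set_encoding(Encoding::UTF_8)

whole = input.read      # tagged with the stream encoding
input.rewind
chunk = input.read(4)   # length-limited reads come back as binary
```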
Comments and changes to this ticket
-
Serge Balyuk June 28th, 2010 @ 10:21 AM
- Tag changed from encoding, ruby-1.9 to encoding, patch, ruby-1.9
-
Serge Balyuk July 4th, 2010 @ 05:02 PM
BTW, I discovered that Rails code overrides the request content_type method in ActionDispatch::Http::MimeNegotiation (master) and changes its semantics: in Rails it returns the mime type (a string value in master and a mime type object in 2.3), while in Rack it returns the full value of the header field. So Rails cuts off the content type options, and the charset is lost. Generally it's not very good to change method semantics in descendants (LSP and stuff), but it seems like the original content_type and content_charset weren't used before.
I can update the patch and add a workaround for this issue, but I'd like to get some feedback first (i.e. is it worth the effort).
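To make the difference concrete, here is a small sketch that splits a full Content-Type header value into the bare media type and its options. split_content_type is a hypothetical helper, not a Rack or Rails method:

```ruby
# Hypothetical helper: returns [media_type, options] for a header value.
def split_content_type(header)
  media_type, *params = header.to_s.split(/\s*;\s*/)
  options = params.each_with_object({}) do |pair, memo|
    key, value = pair.split('=', 2)
    # Option names are case-insensitive; strip optional quotes from values.
    memo[key.strip.downcase] = value.to_s.delete('"') if key
  end
  [media_type && media_type.downcase, options]
end
```

Code that only sees the first element of the pair has no way to recover the charset.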
Referenced by
- 48 Rack::Utils.unescape problems in Ruby 1.9.1 I've just added a patch #100 that should be addressing so...