Commit Graph

89 Commits (1f9f86349410f0a008f8d0df9aa66aed60f7a8e9)

Author SHA1 Message Date
isaacs c0e70354db stream: Short-circuit buffer pushes when flowing
When a stream is flowing, and not in the middle of a sync read, and
the read buffer currently has a length of 0, we can just emit a 'data'
event rather than push it onto the array, emit 'readable', and then
automatically call read().

As it happens, this is quite a frequent occurrence!  Making this change
brings the HTTP benchmarks back into a good place after the removal of
the .ondata/.onend socket kludge methods.
2013-08-08 13:01:09 -07:00
Timothy J Fontaine b26d346b57 Merge remote-tracking branch 'upstream/v0.10'
Conflicts:
	deps/v8/test/cctest/test-api.cc
	lib/events.js
	lib/http.js
2013-08-06 11:59:17 -07:00
Eran Hammer 23d92ec88e stream: Fix double pipe error emit
If an error listener is added to a stream using once() before it is
piped, it is invoked and removed during pipe() but before pipe() sees it
which causes it to be emitted again.

Fixes #4155 #4978
2013-08-06 08:15:13 -07:00
isaacs 22c68fdc1d src: Replace macros with util functions 2013-08-01 15:08:01 -07:00
isaacs 993bb93e0a streams: Don't emit 'end' until read() past EOF
This prevents the following sort of thing from being confusing:

```javascript
stream.on('data', function() { console.error('got data'); });
stream.pause(); // stop reading

// turns out no data is available
stream.push(null);

// Hand the stream to someone else, who does stuff...
setTimeout(function() {
  // too late! 'end' is already emitted!
  stream.on('end', function() { console.error('got end'); });
});
```

With this change, the `end` event is not emitted until you call `read()`
*past* the EOF null.  So, a paused stream will not swallow the `end`
event and emit it before you `resume()` the stream.
2013-07-25 13:14:49 -07:00
Ben Noordhuis 0330bdf519 lib: macro-ify type checks
Increases the grep factor. Makes it easier to harmonize type checks
across the code base.
2013-07-24 21:49:35 +02:00
isaacs 0f8de5e1f9 stream: Simplify flowing, passive data listening
Closes #5860

In streams2, there is an "old mode" for compatibility.  Once switched
into this mode, there is no going back.

With this change, there is a "flowing mode" and a "paused mode".  If you
add a data listener, then this will start the flow of data.  However,
hitting the `pause()` method will switch *back* into a non-flowing mode,
where the `read()` method will pull data out.

Every time `read()` returns a data chunk, it also emits a `data` event.
In this way, a passive data listener can be added, and the stream passed
off to some other reader, for use with progress bars and the like.

There is no API change beyond this added flexibility.
2013-07-22 16:17:30 -07:00
isaacs df6ffc018e stream: unshift('') is a noop
In some cases, the http CONNECT/Upgrade API is unshifting an empty
bodyHead buffer onto the socket.

Normally, stream.unshift(chunk) does not set state.reading=false.
However, this check was not being done for the case when the chunk was
empty (either `''` or `Buffer(0)`), and as a result, it was causing the
socket to think that a read had completed, and to stop providing data.

This bug is not limited to http or web sockets, but rather would affect
any parser that unshifts data back onto the source stream without being
very careful to never unshift an empty chunk.  Since the intent of
unshift is to *not* change the state.reading property, this is a bug.

Fixes #5557
Fixes LearnBoost/socket.io#1242
2013-06-03 10:50:04 -07:00
isaacs d5158574c6 stream: Make default encoding configurable
Pretty much everything assumes strings to be utf-8, but crypto
traditionally used binary strings, so we need to keep the default
that way until most users get off of that pattern.
2013-05-14 11:36:05 -07:00
isaacs bdb78b9945 stream: don't create unnecessary buffers in Readable
If there is an encoding, and we do 'stream.push(chunk, enc)', and the
encoding argument matches the stated encoding, then we're converting from
a string, to a buffer, and then back to a string.  Of course, this is a
completely pointless bit of work, so it's best to avoid it when we know
that we can do so safely.
2013-05-14 11:36:04 -07:00
Daniel Moore 3b6fc600e2 stream: make Readable.wrap support empty streams
This makes Readable.wrap behave properly when the wrapped stream ends
before emitting any data events.
2013-05-08 11:59:40 -07:00
Daniel Moore 1ad93a6584 stream: make Readable.wrap support objectMode
Added a check to see if the stream is in objectMode before deciding
whether to include or exclude data from an old-style wrapped stream.
2013-05-08 11:59:28 -07:00
isaacs b0de1e4a41 stream: Fix unshift() race conditions
Fix #5272

The consumption of a readable stream is a dance with 3 partners.

1. The specific stream Author (A)
2. The Stream Base class (B), and
3. The Consumer of the stream (C)

When B calls the _read() method that A implements, it sets a 'reading'
flag, so that parallel calls to _read() can be avoided.  When A calls
stream.push(), B knows that it's safe to start calling _read() again.

If the consumer C is some kind of parser that wants in some cases to
pass the source stream off to some other party, but not before "putting
back" some bit of previously consumed data (as in the case of Node's
websocket http upgrade implementation).  So, stream.unshift() will
generally *never* be called by A, but *only* called by C.

Prior to this patch, stream.unshift() *also* unset the state.reading
flag, meaning that C could indicate the end of a read, and B would
dutifully fire off another _read() call to A.  This is inappropriate.
In the case of fs streams, and other variably-laggy streams that don't
tolerate overlapped _read() calls, this causes big problems.

Also, calling stream.shift() after the 'end' event did not raise any
kind of error, but would cause very strange behavior indeed.  Calling it
after the EOF chunk was seen, but before the 'end' event was fired would
also cause weird behavior, and could lead to data being lost, since it
would not emit another 'readable' event.

This change makes it so that:

1. stream.unshift() does *not* set state.reading = false
2. stream.unshift() is allowed up until the 'end' event.
3. unshifting onto a EOF-encountered and zero-length (but not yet
end-emitted) stream will defer the 'end' event until the new data is
consumed.
4. pushing onto a EOF-encountered stream is now an error.

So, if you read(), you have that single tick to safely unshift() data
back into the stream, even if the null chunk was pushed, and the length
was 0.
2013-04-11 16:12:48 -07:00
isaacs 929e4d9c9a stream: Emit readable on ended streams via read(0)
cc: @mjijackson
2013-03-28 10:27:18 -07:00
isaacs eafa902632 stream: Handle late 'readable' event listeners
In cases where a stream may have data added to the read queue before the
user adds a 'readable' event, there is never any indication that it's
time to start reading.

True, there's already data there, which the user would get if they
checked However, as we use 'readable' event listening as the signal to
start the flow of data with a read(0) call internally, we ought to
trigger the same effect (ie, emitting a 'readable' event) even if the
'readable' listener is added after the first emission.

To avoid confusing weirdness, only the *first* 'readable' event listener
is granted this privileged status.  After we've started the flow (or,
alerted the consumer that the flow has started) we don't need to start
it again.  At that point, it's the consumer's responsibility to consume
the stream.

Closes #5141
2013-03-28 10:27:18 -07:00
isaacs 14947b6c5e stream: Return self from readable.wrap
Also, set paused=false *before* calling resume().  Otherwise,
there's an edge case where an immediately-emitted chunk might make
it call pause() again incorrectly.
2013-03-14 16:43:19 -07:00
Gil Pedersen e8f80bf479 stream: Never call decoder.end() multiple times
Updated version that does what it says without assigning state.decoder.
2013-03-14 16:13:10 -07:00
isaacs 6399839c39 Revert "stream: Never call decoder.end() multiple times"
This reverts commit 615d809ac6.
2013-03-13 15:48:56 -07:00
Gil Pedersen 615d809ac6 stream: Never call decoder.end() multiple times
Fixes decoder.end() being called on every push(null). As the tls module
does this, corrupt stream data could potentially be added to the end.
2013-03-13 15:20:13 -07:00
isaacs 327b6e3e1d stream: Don't emit 'end' unless read() called
This solves the problem of calling `readable.pipe(writable)` after the
readable stream has already emitted 'end', as often is the case when
writing simple HTTP proxies.

The spirit of streams2 is that things will work properly, even if you
don't set them up right away on the first tick.

This approach breaks down, however, because pipe()ing from an ended
readable will just do nothing.  No more data will ever arrive, and the
writable will hang open forever never being ended.

However, that does not solve the case of adding a `on('end')` listener
after the stream has received the EOF chunk, if it was the first chunk
received (and thus, length was 0, and 'end' got emitted).  So, with
this, we defer the 'end' event emission until the read() function is
called.

Also, in pipe(), if the source has emitted 'end' already, we call the
cleanup/onend function on nextTick.  Piping from an already-ended stream
is thus the same as piping from a stream that is in the process of
ending.

Updates many tests that were relying on 'end' coming immediately, even
though they never read() from the req.

Fix #4942
2013-03-10 11:08:22 -07:00
isaacs cd2b9f542c stream: Avoid nextTick warning filling read buffer
In the function that pre-emptively fills the Readable queue, it relies
on a recursion through:

stream.push(chunk) ->
maybeReadMore(stream, state) ->
  if (not reading more and < hwm) stream.read(0) ->
stream._read() ->
stream.push(chunk) -> repeat.

Since this was only calling read() a single time, and then relying on a
future nextTick to collect more data, it ends up causing a nextTick
recursion error (and potentially a RangeError, even) if you have a very
high highWaterMark, and are getting very small chunks pushed
synchronously in _read (as happens with TLS, or many simple test
streams).

This change implements a new approach, so that read(0) is called
repeatedly as long as it is effective (that is, the length keeps
increasing), and thus quickly fills up the buffer for streams such as
these, without any stacks overflowing.
2013-03-10 11:04:48 -07:00
Gil Pedersen 77a776da90 stream: Always defer preemptive reading to improve latency 2013-03-08 20:09:21 -08:00
isaacs 9208c89058 stream: Raise readable high water mark in powers of 2
This prevents excessively raising the buffer level in tiny increments in
pathological cases.
2013-03-06 11:44:30 -08:00
isaacs a978bedee7 stream: Allow strings in Readable.push/unshift
Fix #4909
2013-03-06 11:44:30 -08:00
isaacs b0f6789a78 stream: Remove bufferSize option
Now that highWaterMark increases when there are large reads, this
greatly reduces the number of calls necessary to _read(size), assuming
that _read actually respects the size argument.
2013-03-06 11:44:30 -08:00
isaacs d5a0940fff stream: Remove pipeOpts.chunkSize
It's not actually necessary for backwards compatibility, isn't
used anywhere, and isn't even tested.  Better to just remove it.
2013-03-06 11:44:30 -08:00
isaacs 8c44869f1d stream: Increase highWaterMark on large reads
If the consumer of a Readable is asking for N bytes, and N > hwm,
then clearly we have set the hwm to low, and ought to increase it.

Fix #4931
2013-03-06 11:44:30 -08:00
isaacs 119cbf4854 stream: Don't require read(0) to emit 'readable' event
When a readable listener is added, call read(0) so that data will flow in, up to
the high water mark.

Otherwise, it's somewhat confusing that you have to listen for readable,
and ALSO call read() (when it will certainly return null) just to get some
data out of the stream.

See: #4720
2013-03-04 07:38:32 -08:00
Trevor Norris 75305f3bab events: add check for listeners length
Ability to return just the length of listeners for a given type, using
EventEmitter.listenerCount(emitter, event). This will be a lot cheaper
than creating a copy of the listeners array just to check its length.
2013-03-01 17:36:47 -08:00
isaacs 88644eaa2d stream: There is no _read cb, there is only push
This makes it so that `stream.push(chunk)` is the only way to signal the
end of reading, removing the confusing disparity between the
callback-style _read method, and the fact that most real-world streams
do not have a 1:1 corollation between the "please give me data" event,
and the actual arrival of a chunk of data.

It is still possible, of course, to implement a `CallbackReadable` on
top of this.  Simply provide a method like this as the callback:

    function readCallback(er, chunk) {
      if (er)
        stream.emit('error', er);
      else
        stream.push(chunk);
    }

However, *only* fs streams actually would behave in this way, so it
makes not a lot of sense to make TCP, TLS, HTTP, and all the rest have
to bend into this uncomfortable paradigm.
2013-02-28 17:38:17 -08:00
isaacs 4b67f0be6d stream: Add stream.unshift(chunk) 2013-02-28 17:38:17 -08:00
isaacs 7764b84297 stream: Break up the onread function
A primary motivation of this is to make the onread function more
inline-friendly, but also to make it more easy to explore not having
onread at all, in favor of always using push() to signal the end of
reading.
2013-02-28 17:38:17 -08:00
isaacs 34046084c0 stream: Do not switch to objectMode implicitly
Only handle objects if explicitly told to do so in the options
object.  Non-buffer/string chunks are an error if not already in
objectMode.

Close #4662
2013-02-25 07:38:10 -08:00
isaacs e03bc472f0 stream: Start out in sync=true state
The Readable and Writable classes will nextTick certain things
if in sync mode.  The sync flag gets unset after a call to _read
or _write.  However, most of these behaviors should also be
deferred until nextTick if no reads have been made (for example,
the automatic '_read up to hwm' behavior on Readable.push(chunk))

Set the sync flag to true in the constructor, so that it will not
trigger an immediate 'readable' event, call to _read, before the
user has had a chance to set a _read method implementation.
2013-02-25 07:38:10 -08:00
isaacs 27d1babaae streams: Pre-emptively buffer readables up to the highWaterMark
Also, this adds a test that guarantees that the ordering of several
push() calls in a row is always preserved in synchronous readable streams
2013-02-22 11:24:05 -08:00
isaacs a63c28e6eb stream: Return false from push() more properly
There are cases where a push() call would return true, even though
the thing being pushed was in fact way way larger than the high
water mark, simply because the 'needReadable' was already set, and
would not get unset until nextTick.

In some cases, this could lead to an infinite loop of pushing data
into the buffer, never getting to the 'readable' event which would
unset the needReadable flag.

Fix by splitting up the emitReadable function, so that it always
sets the flag on this tick, even if it defers until nextTick to
actually emit the event.

Also, if we're not ending or already in the process of reading, it
now calls read(0) if we're below the high water mark.  Thus, the
highWaterMark value is the intended amount to buffer up to, and it
is smarter about hitting the target.
2013-02-21 15:23:18 -08:00
isaacs 3b2e9d2648 stream: remove lowWaterMark feature
It seems like a good idea on the face of it, but lowWaterMarks are
actually not useful, and in practice should always be set to zero.

It would be worthwhile for writers if we actually did some kind of
writev() type of thing, but actually this just delays calling write()
and the overhead of doing a bunch of Buffer copies is not worth the
slight benefit of calling write() fewer times.
2013-02-21 15:23:18 -08:00
Gil Pedersen 0a9930a230 stream: Pipe data in chunks matching read data
This creates better flow for large values of lowWaterMark.
2013-02-15 18:51:22 -08:00
isaacs 1762dd7ed9 stream: read(0) should not always trigger _read(n,cb)
This is causing the CryptoStreams to get into an awful state when
there is a tight loop calling connection.write(chunk) waiting for
a false return.

Because CryptoStreams use read(0) to cycle data, this was causing
the encrypted side to pull way too much data in from the cleartext
side, since the read(0) would make it always call _read.

The unfortunate side effect, fixed in the next patch, is that
CryptoStreams don't automatically cycle when the Socket drains.
2013-02-11 16:43:09 -08:00
isaacs 6bd450155c stream: Empty strings/buffers do not signal EOF any longer 2013-02-11 16:43:09 -08:00
Fedor Indutny c024d2d8c0 streams: both `finish` and `close` should unpipe
Otherwise sockets that are 'finish'ed won't be unpiped and `writing to
ended stream` error will arise.

This might sound unrealistic, but it happens in net.js. When
`socket.allowHalfOpen === false`, EOF will cause `.destroySoon()` call which
ends the writable side of net.Socket.
2013-02-06 20:38:20 +04:00
isaacs a6c18472cd stream: Don't stop reading on zero-length decoded output
Fixes regression introduced in 7e1cf84c9e
2013-01-31 13:33:37 -08:00
isaacs 7e1cf84c9e stream: Don't signal EOF on '' or Buffer(0)
Those values, if passed to the _read() cb, will not signal an EOF.  Only
null or undefined will mark the end of data, and trigger the end event.

However, great care must be taken if you are returning an empty string
or buffer!  There must be some other thing somewhere that will trigger
a read() call, because there will never be a readable event fired later.

This is in preparation for CryptoStreams being ported to streams2, where
it is safe to simply stop reading, because the crypto cycle process will
cause it to read(0) again at some future date.
2013-01-31 11:59:36 -08:00
isaacs 782149ddc3 streams2: Handle sync read callbacks nicely 2013-01-24 07:49:27 -08:00
Raynos 444bbd4fa7 streams: Support objects other than Buffers
We detect for non-string and non-buffer values in onread and
turn the stream into an "objectMode" stream.

If we are in "objectMode" mode then howMuchToRead will
always return 1, state.length will always have 1 appended
to it when there is a new item and fromList always takes
the first value from the list.

This means that for object streams, the n in read(n) is
ignored and read() will always return a single value

Fixed a bug with unpipe where the pipe would break because
the flowing state was not reset to false.

Fixed a bug with sync cb(null, null) in _read which would
forget to end the readable stream
2013-01-24 07:49:27 -08:00
isaacs 14e8f806de stream: Properly handle large reads from push-streams
Problem 1: If stream.push() triggers a 'readable' event, and the user
calls `read(n)` with some n > the highWaterMark, then the push() will
return false (indicating that they should not push any more), but no
future 'readable' event is coming (because we're above the
highWaterMark).

Solution: return true from push() when needReadable is set.

Problem 2: A read(n) for n != 0, after the stream had encountered an
EOF, would not trigger the 'end' event if the EOF was pushed in
synchronously by the _read() function.

Solution: Check for ended in stream.read() and schedule an end event if
the length now equals 0.

Fix #4585
2013-01-16 10:45:11 -08:00
isaacs 20a3c5d09c streams2: Do not allow hwm < lwm
There was previously an assert() in there, but this part of the code is
so high-volume that the added cost made a measurable dent in http_simple.

Just checking inline is fine, though, and prevents a lot of potential
hazards.
2013-01-14 16:03:38 -08:00
isaacs 27fafd4648 stream: Do not call endReadable on a non-empty stream
Say that a stream's current read queue has 101 bytes in it, and the
underlying resource has ended (ie, reached EOF).

If you do something like this:

    stream.read(100); // leave a byte behind
    stream.read(0); // read(0) for some reason

then the read(0) will get 0 from the howMuchToRead function.  Since the
stream was ended, this was incorrectly treating the 0 as a "there is no
more in the buffer", and emitting 'end' before that last byte was read.

Why have the read(0) in the first place?  We do this in some cases to
trigger the last few bytes of a net socket (such as a child process's
stdio pipes).  This was causing issues when piping a `git archive` job
to a file: the resulting tarball was incomplete, because it occasionally
was not getting the last chunk.
2013-01-14 15:22:42 -08:00
isaacs 530585b2d1 stream: Use push() for readable.wrap() 2013-01-10 13:49:53 -08:00
isaacs a993f740f0 stream: Add readable.push(chunk) method 2013-01-10 13:49:53 -08:00