<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">2767-0279</journal-id>
<journal-title-group>
<journal-title>Glossa Psycholinguistics</journal-title>
</journal-title-group>
<issn pub-type="epub">2767-0279</issn>
<publisher>
<publisher-name>eScholarship Publishing</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5070/G60114911</article-id>
<article-categories>
<subj-group>
<subject>Regular article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Watch your tune! On the role of intonation for scalar diversity</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1578-0938</contrib-id>
<name>
<surname>Ronai</surname>
<given-names>Eszter</given-names>
</name>
<email>ronai@northwestern.edu</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-7920-9071</contrib-id>
<name>
<surname>G&#246;bel</surname>
<given-names>Alexander</given-names>
</name>
<email>alexander.gobel@manchester.ac.uk</email>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Northwestern University, US</aff>
<aff id="aff-2"><label>2</label>University of Manchester, UK</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2024-10-03">
<day>03</day>
<month>10</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>3</volume>
<issue>1</issue>
<elocation-id>26</elocation-id>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2024 The Author(s)</copyright-statement>
<copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://glossapsycholinguistics.journalpub.escholarship.org/articles/10.5070/G60114911/"/>
<abstract>
<p>Recent research has highlighted that lexical scales vary in their likelihood of giving rise to a scalar inference &#8211; a finding labeled scalar diversity. The current paper examines the role of intonation for this phenomenon, which has thus far primarily been studied using written materials. A specific focus in this regard was on the so-called rise-fall-rise contour, which has been argued to (i) convey uncertainty, which could have an influence on scalar inference calculation, and (ii) be sensitive to properties of lexical scales, which could interact with factors driving scalar diversity. Experiment 1 combined production with an inference task to assess the likelihood of different intonational contours, as well as how a given contour affects scalar inference rates. Production of the rise-fall-rise varied across lexical scales, as expected, and led to an increase in scalar inference derivation relative to a fall. The latter finding was further confirmed in Experiment 2, which explicitly manipulated intonational contours in the inference task. The results, thus, show the importance of taking intonation into account when studying scalar diversity and scalar inference more generally, and they also have implications for theories of the rise-fall-rise contour. Additionally, the experiments revealed a contour that is prosodically similar to the so-called Contradiction Contour, but appears to serve a different pragmatic function.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>1. Introduction</title>
<p>The investigation of <italic>scalar inferences</italic> (SIs), such as strengthening <italic>some</italic> to <italic>some but not all</italic>, constitutes a well-established testing ground for our understanding of pragmatic reasoning. Due to its ubiquity and tractability, SI is the phenomenon most commonly and comprehensively treated in competing theories of pragmatic mechanisms; these mechanisms, in turn, have been subjected to extensive experimental testing, probing SI&#8217;s status at the interface of grammar, semantics and pragmatics (<xref ref-type="bibr" rid="B11">Cummins &amp; Katsos, 2019</xref>). A recent line of research in this domain has focused on how findings about SIs generalize to other scalar terms beyond the stereotypical cases of <italic>some</italic> and <italic>or</italic> (i.a., <xref ref-type="bibr" rid="B25">Gotzner et al., 2018</xref>; <xref ref-type="bibr" rid="B53">Sun et al., 2018</xref>; <xref ref-type="bibr" rid="B57">van Tiel et al., 2016</xref>). This research has found that lexical scales vary greatly in their potential to give rise to SIs, covering almost the entire spectrum &#8211; a finding referred to as <italic>scalar diversity</italic>. A notable commonality among experimental studies conducted in this domain is that they rely on written stimuli: participants read sentences containing scalar terms silently to themselves before providing their response indicating whether they derived an SI. Crucially, there is a considerable amount of evidence supporting the idea that prosodic structure is assigned even during silent reading (<xref ref-type="bibr" rid="B1">Bader, 1998</xref>; <xref ref-type="bibr" rid="B18">Fodor, 2002</xref>; see also <xref ref-type="bibr" rid="B19">Frazier &amp; Gibson, 2015</xref>). As a result, participants are, in principle, able to project whatever intonation they choose onto the stimuli, which may affect their final response.<xref ref-type="fn" rid="n1">1</xref></p>
<p>This issue becomes of particular importance in light of research on the meaning of intonational contours. Specifically, the so-called rise-fall-rise contour (RFR; <xref ref-type="bibr" rid="B60">Ward &amp; Hirschberg, 1985</xref>) is often discussed in relation to examples of non-maximal scale items, such as those used in scalar diversity studies (e.g., <xref ref-type="bibr" rid="B10">Constant, 2012</xref>), as the contour has been claimed to be infelicitous otherwise (<xref ref-type="bibr" rid="B21">G&#246;bel &amp; Wagner, 2023</xref>). This makes it a likely contour choice for these types of contexts. Additionally, different accounts of the meaning of the RFR predict an effect on SI calculation, although the precise nature of this effect depends on the account in question. Finally, there may be additional restrictions on the use of the RFR with respect to properties of the relevant scales (see <xref ref-type="bibr" rid="B21">G&#246;bel &amp; Wagner, 2023</xref>). Specifically, the felicity of the contour may vary across the range of scalar terms that are typically used in research on scalar diversity. As a result, participants in scalar diversity studies may be more likely to produce an RFR for certain scales, which may, in turn, affect their likelihood of deriving an SI. Crucially, in this scenario, the properties of lexical scales would only play an indirect role, being mediated by intonation, rather than affecting SI rate directly.</p>
<p>Here we present two experiments investigating this issue in more detail. Experiment 1 uses a combination of a production and an inference task to assess both how likely participants are to produce a certain contour in a given context and how the choice of contour affects the likelihood of drawing an SI. Experiment 2 presents participants with a given contour directly and, again, assesses the likelihood of an SI. The results show that the production rate of the RFR varies strongly across lexical scales and that its use &#8211; both in production and perception &#8211; leads to an increase in SI rate. Thus, the experiments provide strong evidence for the relevance of intonation for the study of SIs generally (in line with, i.a., <xref ref-type="bibr" rid="B24">Gotzner, 2019</xref>; <xref ref-type="bibr" rid="B55">Tomlinson et al., 2017</xref>) and scalar diversity specifically. Additionally, the findings bear on accounts of the RFR as well as another contour revealed in the production study, which we refer to as the Concession Contour.</p>
<p>The rest of this article is structured as follows. We first provide background on prior research on scalar diversity (2.1), the role of intonation for SI (2.2), accounts of the RFR (2.3), and existing studies of the RFR-SI relationship (2.4). Section 3 details Experiment 1 (production + inference task), and Section 4 details Experiment 2 (perception + inference task). Section 5 offers discussion of our findings in light of the literature on intonational contours and SI. Section 6 concludes.</p>
</sec>
<sec>
<title>2. Background</title>
<sec>
<title>2.1 Scalar inference and scalar diversity</title>
<p>SI represents one of the classic examples of pragmatic enrichment. An utterance containing the quantifier <italic>some</italic> (1), for example, is often enriched from its lower-bounded meaning (1a) to the upper-bounded meaning <italic>some but not all</italic> (1b).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(1)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Miriam caught some of the mice.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Miriam caught at least some of the mice.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;literal</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Miriam caught some, but not all, of the mice.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SI-enriched</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>While there are many different theoretical proposals as to how SIs arise, a standard (neo-)Gricean account posits the following. Hearers assume that speakers are following the Maxim of Quantity (<xref ref-type="bibr" rid="B26">Grice, 1967</xref>), and are therefore trying to be as informative as is required in the context. A more informative alternative utterance to (1) would have been <italic>Miriam caught all of the mice</italic>. <italic>Informativity</italic> can be defined as asymmetric entailment: <italic>Miriam caught all of the mice</italic> entails <italic>Miriam caught some of the mice</italic>, but not vice versa; hence, the former is more informative (<xref ref-type="bibr" rid="B32">Horn, 1972</xref>). Therefore, when a comprehender encounters an utterance like (1), they reason about the speaker&#8217;s intention behind not uttering the more informative, stronger alternative statement. This may have happened because the stronger alternative is false, and the speaker chose not to utter it in order to avoid violating the Maxim of Quality. This reasoning process leads hearers to derive the negation of the unsaid alternative (<italic>Miriam didn&#8217;t catch all of the mice</italic>) which, combined with the original utterance&#8217;s literal meaning (1a), results in the SI-enriched meaning (1b).</p>
<p>While the <italic>some but not all</italic> SI, based on the &lt;<italic>some, all</italic>&gt; lexical scale, is the most widely discussed example, SI can also arise from other pairs of lexical items that form a scale. The example in (2), for instance, is based on the &lt;<italic>happy, ecstatic</italic>&gt; scale.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(2)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>The winner is happy.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>The winner is at least happy.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;literal</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>The winner is happy, but not ecstatic.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SI-enriched</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Given (neo-)Gricean assumptions, the <italic>happy but not ecstatic</italic> SI can be derived in the same way as <italic>some but not all</italic>. Hearers of the weaker utterance in (2) can reason that the speaker did not utter the more informative alternative <italic>The winner is ecstatic</italic>, because it would not have been true. The weaker utterance&#8217;s literal meaning (2a) and the negation of the stronger alternative, then, together give rise to the SI-enriched meaning (2b). But while the mechanism underlying these two different SIs is posited to be the same, they do not arise equally robustly: hearers are much more likely to enrich <italic>some</italic> to mean <italic>not all</italic> than <italic>happy</italic> to mean <italic>not ecstatic</italic> (<xref ref-type="bibr" rid="B47">Ronai &amp; Xiang, 2024</xref>). In fact, <italic>scalar diversity</italic> is now a well-replicated finding. This term refers to the substantial variation across different lexical scales in how likely they are to lead to SI. In van Tiel et al.&#8217;s (<xref ref-type="bibr" rid="B57">2016</xref>) highly influential study, for instance, the rate at which participants calculated SIs ranged from 4% to 100% across the 43 scales tested (see also earlier work by <xref ref-type="bibr" rid="B2">Baker et al., 2009</xref>; <xref ref-type="bibr" rid="B6">Beltrama &amp; Xiang, 2013</xref>; <xref ref-type="bibr" rid="B16">Doran et al., 2012</xref>).</p>
<p>Existing experimental studies of scalar diversity have concentrated on answering the question of what can explain the observed inter-scale variation in SI calculation. How likely a scale is to lead to SI has been related to various properties of the stronger alternative (e.g., <italic>all, ecstatic</italic>), or of the relationship between the weaker scalar term (e.g., <italic>some, happy</italic>) and that alternative. For example, van Tiel et al. (<xref ref-type="bibr" rid="B57">2016</xref>) have shown that the more distinct the weaker and the stronger term are, the more likely they are to lead to SI. This is because the stronger alternative needs to be sufficiently distinct from the weaker term for SI to arise; if the two terms are not distinct enough, the speaker&#8217;s non-utterance of the stronger term is not necessarily due to its falsity. Here, distinctness was operationalized as semantic distance (as measured in a rating task) and boundedness (whether the stronger alternative is endpoint-denoting). Westera and Boleda (<xref ref-type="bibr" rid="B63">2020</xref>) have proposed that semantic relatedness, based on distributional semantics, is another component of distinctness, which they indeed found to be negatively correlated with SI rates. A property of stronger alternatives that has been shown to predict scalar diversity is how expected they are or, in other words, how (un)certain hearers are about the identity of the relevant stronger alternative, given the weaker term uttered &#8211; the greater the uncertainty, the less likely SI is to arise (<xref ref-type="bibr" rid="B34">Hu et al., 2022</xref>, <xref ref-type="bibr" rid="B33">2023</xref>; see also <xref ref-type="bibr" rid="B46">Ronai &amp; Xiang, 2022</xref>). Concentrating on SIs arising from adjectival scales in particular, Gotzner et al. (<xref ref-type="bibr" rid="B25">2018</xref>) have related scalar diversity to the underlying scalar semantics of adjectives. 
Polarity was one relevant predictor, with the authors&#8217; results revealing higher SI rates for negative adjectives (e.g., &lt;<italic>bad, awful</italic>&gt;) than positive ones (e.g., &lt;<italic>good, great</italic>&gt;); see also Pankratz and van Tiel (<xref ref-type="bibr" rid="B41">2021</xref>) for a replication using different diagnostics for polarity. Another factor from adjectival semantics is extremeness: extreme adjectives (e.g., <italic>excellent, huge</italic>) have been shown to lead to lower SI rates (<xref ref-type="bibr" rid="B6">Beltrama &amp; Xiang, 2013</xref>; <xref ref-type="bibr" rid="B25">Gotzner et al., 2018</xref>). Aside from deriving across-scale variation from properties of the weaker term and its stronger alternative, studies have also suggested that the propensity for SI is linked to another type of semantic process or pragmatic inference that is variable across scales (<xref ref-type="bibr" rid="B25">Gotzner et al., 2018</xref>; <xref ref-type="bibr" rid="B53">Sun et al., 2018</xref>). Last but not least, the role of context and contextual relevance in explaining scalar diversity has also been investigated, focusing either on discourse or on sentential context (<xref ref-type="bibr" rid="B41">Pankratz &amp; van Tiel, 2021</xref>; <xref ref-type="bibr" rid="B44">Ronai &amp; Xiang, 2021a</xref>; <xref ref-type="bibr" rid="B51">Simons &amp; Warren, 2018</xref>; <xref ref-type="bibr" rid="B54">Sun et al., 2023</xref>).</p>
<p>One limitation of this existing body of work that we would like to highlight is that all prior studies of scalar diversity have used exclusively written experimental stimuli, or modeled data from other studies that have done so. This presents a potential issue in light of the fact that &#8211; as we will review below in 2.2 &#8211; intonation is known to affect SI calculation. Most crucially for our purposes, certain intonational contours are also sensitive to the same factors that have been identified as predicting scalar diversity. As mentioned in Section 1 and further discussed in 2.3, one contour of interest is the RFR, which is predicted to affect the likelihood of drawing an SI. Additionally, the RFR has been argued to be felicitous with positive, but not negative, statements (in negative and positive contexts, respectively; see <xref ref-type="bibr" rid="B20">G&#246;bel, 2019</xref>; <xref ref-type="bibr" rid="B21">G&#246;bel &amp; Wagner, 2023</xref>). These two factors could, then, conspire to give the appearance that adjective polarity affects SI rates directly. As mentioned, negative scales have been found to lead to SI more robustly than positive ones (<xref ref-type="bibr" rid="B25">Gotzner et al., 2018</xref>). Such a finding, however, could in principle be an epiphenomenon arising from the RFR decreasing SI rates and negative adjectives being less likely to be silently read with an RFR. Notably, adjective polarity may be just one factor with the potential to conspire in this way, given that many aspects that constrain the use of the RFR are not yet fully understood. As a result, there is reason to believe that using auditory stimuli, carefully controlling and manipulating the intonation with which SI-triggering utterances are produced, could uncover interesting patterns that written studies on scalar diversity have obscured.</p>
</sec>
<sec>
<title>2.2 The role of intonation for SI</title>
<p>As mentioned, there are robust findings in the literature showing that intonation affects how likely SI is to arise in intonation languages like English, French, Dutch and German.<xref ref-type="fn" rid="n2">2</xref> We start by reviewing work that has examined the effect of pitch accent placement. Schwarz et al. (<xref ref-type="bibr" rid="B50">2007</xref>) investigated SI arising from the &lt;<italic>or, and</italic>&gt; lexical scale, via sentences such as (3). They varied the placement of the L+H* accent,<xref ref-type="fn" rid="n3">3</xref> which was assumed to mark prosodic focus; it occurred either on the disjunction (3a) or the auxiliary (3b).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(3)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Mary will invite Fred OR Sam to the barbecue.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Mary WILL invite Fred or Sam to the barbecue.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Having been presented with one of the above sentences, participants had to choose between two alternative interpretations: the literal meaning <italic>She will invite Fred or Sam or possibly both</italic> and the SI-enriched meaning <italic>She will invite Fred or Sam but not both</italic>. The authors found a higher rate of SI-enriched, exclusive <italic>not both</italic> interpretations when the L+H* accent was placed on <italic>or</italic> (3a). Using a truth value judgment task, Chevallier et al. (<xref ref-type="bibr" rid="B9">2008</xref>) found converging results for French, namely, that prosodic stress on <italic>or</italic> leads to an increase in the exclusive <italic>not both</italic> interpretation. Lastly, Zondervan (<xref ref-type="bibr" rid="B65">2010</xref>) conducted a similar manipulation in Dutch, contrasting sentences like those in (4). In (4a), the entire NP containing the disjunction received two H* accents (one on each disjunct), whereas in (4b), the subject received one H* accent.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(4)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Paola took AN APPLE OR A PEAR from the fruit section.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>PAOLA took an apple or a pear from the fruit section.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Participants were asked to judge the target sentence &#8211; (4a) or (4b) &#8211; as true or false, in the context of a story that made it clear that Paola had, in fact, taken both an apple and a pear. If a hearer has calculated the <italic>not both</italic> SI from the disjunction, they would, therefore, judge the target sentence to be false. In line with the other studies discussed, Zondervan (<xref ref-type="bibr" rid="B65">2010</xref>) found a significant effect such that more SIs were calculated when <italic>or</italic> was in the accented part of the sentence (4a).</p>
<p>There also exists work manipulating not the placement of the accent, but its type. Several studies in this domain have focused on ad hoc scales (<xref ref-type="bibr" rid="B30">Hirschberg, 1985</xref>) giving rise to exhaustive inferences. Gotzner (<xref ref-type="bibr" rid="B24">2019</xref>), for instance, tested sentences like (5) in German.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(5)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Context: The judge and witness followed the argument.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Critical sentence: The {judge/JUDGE} believed the defendant.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>c.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Alternative statement: The witness believed the defendant.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Participants were presented with the context sentence, followed by the critical sentence, and then had to make a truth value judgment on the alternative statement. If a participant has calculated the ad hoc inference <italic>The judge, but not the witness, believed the defendant</italic>, then they would judge the alternative statement to be false. Crucially, Gotzner (<xref ref-type="bibr" rid="B24">2019</xref>) manipulated the intonation of the critical sentence, which occurred either with an L+H* or an H* accent on the target word (<italic>judge</italic>). The findings revealed that participants computed more exhaustive inferences with an L+H* accent, as indicated by a lower percentage of True responses.</p>
<p>Using mouse-tracking to investigate online processing, Tomlinson et al. (<xref ref-type="bibr" rid="B55">2017</xref>) also compared what effect an L+H* vs. H* pitch accent has on ad hoc SIs in English, and found that the inference is processed earlier under the former contour. Tomlinson and Ronderos (<xref ref-type="bibr" rid="B56">2021</xref>), in turn, compared the effect of the L+H* and L*+H contours on the exhaustive interpretation arising from dialogues such as (6).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(6)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Were Manu and Moni at the party?</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B: Manu was there.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>The authors were interested in the derivation of the inference that Speaker B believes that Manu was there at the party, but Moni was not (=Speaker B believes that (&#172;Moni, Manu)). They compared B&#8217;s utterance when pronounced with the L+H* vs. L*+H contour and found that SIs were more delayed and derived at lower rates with the L+H* contour. Altogether, the studies discussed thus far provide convincing evidence that intonation affects both the likelihood and processing of SIs.</p>
<p>As mentioned above, though the effects of intonation on SI calculation are well established, work on scalar diversity has tended to use written stimuli. Nonetheless, there are two notable exceptions, that is, two studies that manipulated intonation while testing multiple different lexical scales. Cummins and Rohde (<xref ref-type="bibr" rid="B12">2015</xref>) tested 20 different English adjectival scales, and presented participants with sentences such as <italic>The view from the hotel room is pretty</italic> in two intonation conditions: neutral vs. with prosodic focus placement on the scalar adjective (here, <italic>pretty</italic>). The authors take the focus manipulation to be a manipulation of the question under discussion (QUD; <xref ref-type="bibr" rid="B43">Roberts, 2012</xref>), which they predict would influence SI rates. Indeed, they found that participants were more likely to calculate the SI (e.g., <italic>not gorgeous</italic>) in the focus condition. However, as their by-item results show (<xref ref-type="bibr" rid="B12">Cummins &amp; Rohde, 2015, p. 7, Figure 1</xref>), scalar terms differ in how susceptible they were to the intonation manipulation. There is substantial variation in effect size &#8211; i.e., in how much more likely the SI was to be calculated in the focus condition than in the neutral condition &#8211; and 6 scales, in fact, show the opposite pattern to the overall effect. This suggests that it is indeed important to study the effects of intonation on SI calculation across many scales, and to study scalar diversity with auditory stimuli. Crucially, one way in which our study differs from Cummins and Rohde (<xref ref-type="bibr" rid="B12">2015</xref>) is that we are interested in more complex intonational contours over the whole SI-triggering utterance (e.g., the RFR), rather than just manipulating whether the weaker scalar term is focused.</p>
<p>An even more relevant study for the present purposes, de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>), tested the effect of the RFR on multiple scales. Before discussing this study in more detail, however, we first want to provide sufficient background on prior research on the RFR.</p>
</sec>
<sec>
<title>2.3 The rise-fall-rise contour</title>
<p>The use of the RFR is illustrated by the underlined sentences in the naturally occurring example in (7).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(7)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p><italic>CK</italic>:</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>If everybody knew everybody, we wouldn&#8217;t have the problems we have in the world today. <underline>You don&#8217;t rob somebody</underline> <underline>if</underline> <underline>you know their name.</underline></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p><italic>JS</italic>:</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><underline>You&#8217;re robbin&#8217; <sc>me</sc>&#8230;</underline>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1f_rVXinuVtC37CGD8n0f4XoBgwfVk5D2/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>An early influential account of the RFR comes from Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), who propose that it conveys speaker uncertainty with respect to a scale. The authors primarily focus on the RFR in replies to questions as in (8), where its contribution can be intuitively described as a polite hedge. Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>) capture data like these by proposing that the RFR conveys uncertainty about one of three things: whether it is appropriate to evoke a scale (8a), which scale is being evoked (8b), or where a particular value falls on a given scale (8c).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(8)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Are you leaving today?</p></list-item>
<list-item><p>B: I&#8217;m not leaving <sc>today</sc>&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), (54)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Are you a doctor?</p></list-item>
<list-item><p>B: I have a <sc>PhD</sc>&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), (58)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>c.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Have you ever been West of the Mississippi?</p></list-item>
<list-item><p>B: I&#8217;ve been to <sc>Missouri</sc>&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), (62)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>As an alternative but related proposal, Constant (<xref ref-type="bibr" rid="B10">2012</xref>) draws a connection between the RFR and focus particles like <italic>only</italic>. On this view, the RFR quantifies over alternative propositions and indicates that they cannot be safely claimed by the speaker. One &#8211; highly relevant &#8211; pattern that motivates this account is that the RFR can only occur when the accented element does not resolve all other alternatives (is not &#8220;alternative dispelling&#8221;, in Constant&#8217;s terms), as illustrated in (9). Both maximal scale elements, which either entail the falsity of all stronger alternatives (<italic>no one</italic>) or entail the truth of all weaker alternatives (<italic>all</italic>), are infelicitous, while the element that leaves alternatives open (<italic>most</italic>) is felicitous. This pattern is captured by the assumption that in the cases of <italic>no one</italic> and <italic>all</italic>, the domain of alternatives to the asserted proposition that the RFR quantifies over is empty, and that there is a general ban on this vacuous quantification. Additionally, the contribution of the RFR is treated as a conventional implicature, by virtue of it being speaker-oriented and independent of at-issue content.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(9)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Did your friends like the movie?</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B: <sc>Most</sc> of my friends liked it&#8230;</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B: #<sc>No one</sc> liked it&#8230;</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>c.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B: #<sc>All</sc> of my friends liked it&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Constant (<xref ref-type="bibr" rid="B10">2012</xref>), (33)&#8211;(34)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Further related accounts come from Wagner (<xref ref-type="bibr" rid="B58">2012</xref>) and Wagner et al. (<xref ref-type="bibr" rid="B59">2013</xref>). Wagner differs from Constant in assuming that the RFR operates over alternative speech acts rather than propositions, and that it contributes a presupposition rather than a conventional implicature. That is, the alternatives assumed to be evoked by the RFR are not calculated relative to the focus of the sentence (e.g., {<italic>Most/None/All/&#8230;</italic>} <italic>of my friends liked it&#8230;</italic> in (9)) but, more broadly, to what else could have been said. This adjustment is meant to capture the RFR&#8217;s ability to be embedded as its own speech act that is separate from the rest of the sentence, rather than having to take scope over the whole utterance, as shown with the appositive relative clause in (10).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(10)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>John &#8211; <underline>who likes sweets</underline> &#8211; was an obvious suspect.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Wagner and colleagues focus on the incompleteness component of the RFR, stated in (11), and present experimental evidence from contexts like (12) that the RFR is produced more frequently and perceived as more acceptable in partial answers, compared to complete answers.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(11)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><bold>RFR</bold> (<italic>p</italic>): The speaker asserts <italic>p</italic> but considers it to be only an incomplete answer to the question under discussion.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(12)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><underline>Partial answer</underline></p></list-item>
<list-item><p>Q: Is either Bill or Susan coming to the party?</p></list-item>
<list-item><p>A: Bill is coming.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><underline>Complete answer</underline></p></list-item>
<list-item><p>Q: Is Bill coming to the party?</p></list-item>
<list-item><p>A: Bill is coming.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Although the previous three accounts seem closely related, they differ in a small but important detail. While all three accounts are compatible with the RFR providing an incomplete answer when the truth of other alternatives is unknown, Constant additionally allows alternatives to be unclaimable, because they are known to be false. This feature captures the fact that the RFR can be followed up with an answer that fully resolves the relevant question, as in (13), which is incompatible with Wagner&#8217;s and Wagner et al.&#8217;s accounts.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(13)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Did your friends like the movie?</p></list-item>
<list-item><p>B: <sc>John</sc> liked it&#8230; the rest of them hated it.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Constant (<xref ref-type="bibr" rid="B10">2012</xref>), (16)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>A different account comes from Westera (<xref ref-type="bibr" rid="B62">2019</xref>), which can be viewed as elaborating on the relevance of the QUD, highlighted by Wagner et al. (<xref ref-type="bibr" rid="B59">2013</xref>). Embedded in a Gricean theory of pragmatics (for more details on the framework, see <xref ref-type="bibr" rid="B61">Westera, 2017</xref>), Westera proposes that the RFR &#8211; assumed to also cover cases of Contrastive Topic (<xref ref-type="bibr" rid="B8">B&#252;ring, 2003</xref>) &#8211; conveys information about whether a conversational maxim is being violated or adhered to, and what a speaker takes the available QUDs to be. More specifically, by using an RFR, the speaker is taken to indicate that there are at least two QUDs that are being addressed, and that with respect to one QUD, a maxim is being violated (or suspended), while for another QUD, a maxim is complied with. To illustrate this account with the case in (9), we can assume that the main QUD is the explicitly provided question (<italic>Did your friends like the movie?</italic>), and using the RFR conveys that the answer given does not fully resolve the question, thereby suspending the Maxim of Quantity. As a result, both <italic>no one</italic> and <italic>all</italic> are infelicitous, because they fully resolve the question, and consequently, no maxim is violated, contrary to what the RFR is assumed to convey. A possible secondary QUD for this case could be something along the lines of <italic>Do you think I should go see the movie?</italic> &#8211; this is ultimately left to pragmatic reasoning.</p>
<p>Lastly, G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>) shift their attention to the function of the RFR in argumentative dialogues. The observation they make is that the RFR exhibits an asymmetry in replies to statements, depending on the polarity of the initial statement, which they dub <italic>valence asymmetry</italic>. While the RFR is felicitous when providing a positive counterpoint to a negative statement (14a), it is degraded when the order is reversed and its carrier utterance provides a negative counterpoint to a positive statement (14b).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(14)</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: The bike ride yesterday was really terrible, the weather was horrific.</p></list-item>
<list-item><p>B: We had a cocktail&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1p1bRN_I7agJ5Q-mClEyWkUUG6jNdj5SR/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: The bike ride yesterday was really great, the weather was perfect.</p></list-item>
<list-item><p>B: #We had an accident&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1F4ogxBom1VCIEajXUwaxVNuTNNsOEGGU/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Notably, this pattern is unexplained by previous accounts, since B&#8217;s replies in (14a) and (14b) do not differ in whether alternatives are left open or not. The authors, hence, propose that the RFR conveys the presence of a stronger alternative on a pragmatically inferred scale. For cases like (14), this scale concerns an evaluation, here, of the quality of the bike ride, where the positive reply implies a stronger &#8211; or better &#8211; alternative to A&#8217;s statement, whereas the negative reply implies a weaker &#8211; or worse &#8211; one. For cases like (9), on the other hand, the scale is one of logical entailment, such that a stronger alternative to <italic>most</italic> would be <italic>all</italic>, capturing the pattern in a similar way as previous accounts.</p>
<p>This account of the RFR and the case of the valence asymmetry directly connect to the notion of adjective polarity in the scalar diversity literature, as mentioned earlier. As an illustration, consider the case of the scale &lt;<italic>ugly, hideous</italic>&gt;, categorized as a pair with negative polarity by Gotzner et al. (<xref ref-type="bibr" rid="B25">2018</xref>). On the assumption that <italic>ugly</italic> and <italic>hideous</italic> are on a measurement scale regarding beauty with other adjectives like <italic>pretty</italic> and <italic>beautiful</italic>, the stronger predicate <italic>hideous</italic> would actually be lower than <italic>ugly</italic> on the scale (see <xref ref-type="bibr" rid="B52">Solt, 2015</xref>). As a result, the RFR would be predicted to be unacceptable by G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>). This prediction is borne out intuitively for the item in (15).<xref ref-type="fn" rid="n4">4</xref> We would, therefore, expect the RFR to occur less frequently with negative scales than positive scales, and to contribute to the appearance of a polarity effect in scalar diversity, assuming the RFR has its own independent effect on SI rates.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(15)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Is the wallpaper hideous?</p></list-item>
<list-item><p>B: ??It is ugly&#8230;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1osKyilhpK7VVeYp63sSsph_qYVfqmBEH/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Regarding SI rates, the accounts of the RFR differ in their predictions about its effect on the likelihood of drawing an SI. Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), Wagner (<xref ref-type="bibr" rid="B58">2012</xref>) and Wagner et al. (<xref ref-type="bibr" rid="B59">2013</xref>) can all be argued to predict a decrease in SI rate for the RFR, relative to a Fall: drawing an SI relies on negating a stronger alternative, which is incompatible with having to leave the truth of said alternative open, as required by these accounts.<xref ref-type="fn" rid="n5">5</xref> In contrast, G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>) simply treat the RFR as implying the existence of a stronger alternative, while remaining agnostic regarding its truth value. A possible effect of the RFR could, then, be that highlighting the salience of the relevant alternative leads to an increase in SI rate. The idea that the salience of alternatives can affect SI rate in this way is supported by findings from the written domain from Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>; see also <xref ref-type="bibr" rid="B45">Ronai &amp; Xiang, 2021b</xref>; <xref ref-type="bibr" rid="B64">Yang et al., 2018</xref>; <xref ref-type="bibr" rid="B66">Zondervan et al., 2008</xref>), who found that a prior question that mentions the stronger alternative leads to an increase in SI rate, relative to when the SI-triggering sentence occurs without a question context, or following a question that mentions the weaker scalar term itself. An intermediate position is taken by Constant (<xref ref-type="bibr" rid="B10">2012</xref>), whose account is compatible with either an increase or a decrease, given that alternatives can be unclaimable either because they are considered false (SI increase) or because they are not known (SI decrease). 
Similarly, Westera&#8217;s (<xref ref-type="bibr" rid="B62">2019</xref>) account allows some flexibility, such that the exact predictions with respect to SI rate are less definitive. On the one hand, the account includes a prediction that the RFR may not exhaustively resolve the main QUD, which would result in a decrease in SI rate. On the other hand, in the cases we are concerned with, the RFR may also pick up on a secondary QUD &#8211; so the explicitly mentioned question may no longer be the main QUD, and the RFR may be compatible with an increase in SI rate.</p>
</sec>
<sec>
<title>2.4 Previous work on the effect of RFR on SI calculation</title>
<p>While the relation between the RFR and SIs has not been the main concern of the accounts discussed above, there exist two notable studies that have looked at this relation. The first is de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>), mentioned above, who tested dialogues such as (16), where the reply contains a weaker scalar term, and the question, a stronger alternative.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(16)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Mike</italic>: Was your hike exhausting?</p></list-item>
<list-item><p><italic>Julie</italic>: It was strenuous.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>The authors manipulated whether Julie&#8217;s answer was pronounced with a neutral fall (H* L-L%) or an RFR (L*+H L-H%), and asked participants to indicate whether Julie &#8220;mean[s] that her hike was exhausting&#8221; on a 7-point scale (from <italic>definitely no</italic> to <italic>definitely yes</italic>). The results showed that the RFR led to fewer positive responses than a neutral fall, suggesting that it increased the likelihood of drawing an SI. However, while the experiment tested 16 different adjectival scales, the authors&#8217; main focus was not on by-scale variation, leaving open the question of how (or whether) the intonation manipulation interacted with scalar diversity.</p>
<p>Moreover, the conclusion drawn by de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>) has been questioned by Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear</xref>), the second study on the RFR and SIs to be discussed here. Central to their criticism is the distinction between SIs and ignorance inferences: rather than taking the assertion of a weaker alternative as evidence that the speaker believes the stronger alternative to be false, a hearer can also understand it as indicating that the speaker is simply ignorant regarding its truth value. Crucially, replying <italic>no</italic> to the questions posed by de Marneffe and Tonhauser is compatible with either type of inference. Thus, the results can be given an interpretation other than the RFR increasing the likelihood of SI calculation.</p>
<p>To address the mapping of intonation onto pragmatic inferences directly, Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear</xref>) tested which type of inference &#8211; ignorance inference or SI &#8211; participants would be more likely to draw after hearing a target sentence with either a Fall or an RFR, as illustrated in (17).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(17)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>A: Did all of the guests eat dinner?</p></list-item>
<list-item><p>B: <sc>Some</sc> of them ate dinner.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B thinks that not all of the guests ate dinner.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SI</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>B isn&#8217;t sure whether or not all of the guests ate dinner.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ignorance inference</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>In their first experiment, participants were given one contour and had to choose one of the inference options. Both contours led to SI choices overwhelmingly, without a significant difference between them (although the preference was numerically larger for Fall). The second experiment then gave participants both intonation versions at once and asked them to choose between mapping Fall to SI and RFR to ignorance inference, or Fall to ignorance inference and RFR to SI. Under those conditions, participants showed a preference for the former option, which the authors took as evidence that the RFR conveys uncertainty. While this study did not test the effect of the RFR on the likelihood of SI calculation directly (only via comparison to ignorance inferences) and was restricted to the &lt;<italic>some, all</italic>&gt; scale, it does add to the body of evidence demonstrating that SI-related interpretations are sensitive to intonational cues.</p>
<p>With this background, we now turn to our own experimental investigation of the role of intonation for scalar diversity.</p>
</sec>
</sec>
<sec>
<title>3. Experiment 1: Production + inference task</title>
<p>In Experiment 1, we investigate the effect of intonation on SI calculation &#8211; in the context of scalar diversity &#8211; by combining a production task with an inference task. This allows us to see what intonational contours participants produce for various potentially SI-triggering sentences, and whether they calculate the SI (as measured by the inference task), given a certain contour.</p>
<sec>
<title>3.1 Method, materials &amp; design</title>
<p>The stimuli in Experiment 1 consisted of question-answer dialogues containing scalar terms, as in (18). The dialogues featured the following experimental manipulation: either the question prompt (Emma&#8217;s question) and the target sentence (the participant&#8217;s reply) contained the same weaker scalar term (18a), or the question contained the chosen stronger alternative (18b). There were 60 lexical scales taken from Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>), in addition to 20 fillers. The <sc>same</sc> vs. <sc>strong</sc> manipulation was administered within participants in a Latin Square design, i.e., each participant saw both conditions, but each item in only one of them.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(18)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Sample Item, Experiment 1</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Emma</italic>: Was the winner happy?&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<sc>same</sc></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Emma</italic>: Was the winner ecstatic?&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<sc>strong</sc></p></list-item>
<list-item><p><italic>You</italic>: She was happy.</p></list-item>
<list-item><p><italic>Given your response, do you think Emma would conclude that the winner was not ecstatic?</italic></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
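<p>The Latin Square assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiment: the function names <monospace>latin_square_condition</monospace> and <monospace>build_list</monospace>, the zero-based item and list indices, and the rotation rule are all hypothetical. The idea is that each presentation list contains every item exactly once, and rotating the condition by list index means that each item appears in each condition equally often across lists.</p>
<preformat>
```python
# Hypothetical sketch of a Latin Square rotation; names and indexing are
# illustrative assumptions, not taken from the experiment's source code.
CONDITIONS = ["same", "strong"]  # the two Experiment 1 conditions
N_ITEMS = 60                     # number of lexical scales (experimental items)


def latin_square_condition(item_index, list_index, conditions=CONDITIONS):
    """Return the condition in which an item appears on a given list."""
    return conditions[(item_index + list_index) % len(conditions)]


def build_list(list_index, n_items=N_ITEMS, conditions=CONDITIONS):
    """Build one presentation list: every item once, in exactly one condition."""
    return [
        (item, latin_square_condition(item, list_index, conditions))
        for item in range(n_items)
    ]


if __name__ == "__main__":
    for k in range(len(CONDITIONS)):
        trials = build_list(k)
        n_same = sum(1 for _, cond in trials if cond == "same")
        print(f"list {k}: {n_same} same, {len(trials) - n_same} strong")
```
</preformat>
<p>The same rotation carries over to designs with more conditions by passing a longer list of condition labels, in which case the number of presentation lists equals the number of conditions.</p>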
<p>Participants first saw the full dialogue on the screen. After pressing a button, they heard an audio recording of the question (<italic>Was the winner happy?</italic> or <italic>Was the winner ecstatic?</italic>) and had to record themselves saying the reply (<italic>She was happy</italic>.). Emma&#8217;s questions were presented both auditorily and in written form in order to make the task more natural. Afterwards, participants were given the task question <italic>Given your response, do you think&#8230;?</italic> (italicized in (18)) and chose between &#8220;Yes&#8221; and &#8220;No&#8221; as their answer. In this adapted version of the inference task from van Tiel et al. (<xref ref-type="bibr" rid="B57">2016</xref>) (see also, i.a., <xref ref-type="bibr" rid="B41">Pankratz &amp; van Tiel, 2021</xref>), if a participant responds with &#8220;Yes&#8221;, that can be taken to indicate SI calculation: that the participant has enriched <italic>happy</italic> to <italic>not ecstatic</italic>. Responding with &#8220;No&#8221;, on the other hand, suggests that the participant has not calculated the SI and takes <italic>happy</italic> to be compatible with <italic>ecstatic</italic>.<xref ref-type="fn" rid="n6">6</xref> Altogether, this method allowed us to gather data on the production rates of relevant contours on the target sentence across conditions and items, as well as to examine SI rates in light of a given contour being produced.</p>
<p>Recordings were manually annotated by the second author in terms of the overall contour used by the participant on a given item. The annotator listened to target sentences in Praat. Sentences were presented without the context sentence or knowledge of condition in order to avoid bias. Contours were categorized according to a combination of the visual pitch information available and the auditory impression, given that audio quality was not always sufficient to guarantee accurate pitch tracking. The &#8220;a priori&#8221; categories originally included five contours:</p>
<list list-type="roman-lower">
<list-item><p>a pitch accent on the scalar item, e.g., <italic>happy</italic> in (18), and a monotonous final fall &#8211; (L+)H* L-L% in ToBI labels (= Fall),</p></list-item>
<list-item><p>a rising pitch accent on the scalar item, followed by a low phrasal accent and a rising final boundary tone &#8211; {L*+H/L+H*} L-H% (= RFR),</p></list-item>
<list-item><p>a pitch accent on the auxiliary (if present), which we take to indicate Verum Focus (see <xref ref-type="bibr" rid="B31">H&#246;hle, 1992</xref>; <xref ref-type="bibr" rid="B40">Lohnstein, 2015</xref>; as well as 3.5 below), followed by a monotonous final fall (= Verum Focus Fall, following <xref ref-type="bibr" rid="B22">Goodhue et al., 2016</xref>),</p></list-item>
<list-item><p>a monotonous final rise without a preceding rising accent &#8211; L* H-H% (= Rising Declarative, see, e.g., <xref ref-type="bibr" rid="B28">Gunlogson, 2001</xref>),</p></list-item>
<list-item><p>Other/Unclear.</p></list-item>
</list>
<p>However, after initial inspection, two changes were made. First, Rising Declaratives were taken out, due to the contour not occurring sufficiently frequently. Second, as mentioned in Section 1, there was a notably frequent use of a contour with an initial and a final high tone that we labeled &#8220;Concession Contour&#8221;, illustrated in (19), so it was added as one of the categories.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(19)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>(A: Was the winner happy/ecstatic?)</p></list-item>
<list-item><p>B: She was happy.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1dbzhmacH2TWyGRa17HsJTJrYLBS4awNk/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
</sec>
<sec>
<title>3.2 Procedure</title>
<p>The experiment was implemented through prosodylabExperimenter (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/prosodylab/prosodylabExperimenter">https://github.com/prosodylab/prosodylabExperimenter</ext-link>). Participants first saw a welcome screen, followed by a chance to adjust their volume and test their microphone, an online consent form, and a language background questionnaire. Afterwards, there was a test in which participants were played three sounds and had to choose which one was the quietest, which required the use of headphones. For the main part of the experiment, participants provided their production of the target sentence and then answered the question for the inference task, as described above. There were three practice trials after the instructions were received, followed by a total of 80 stimuli. The experiment concluded with a chance to provide feedback. A test version of the experiment can be accessed at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://prosodylab.org/~agobel/conepi/30-scaRFR_Pro2AFC/?SESSION_ID=Glossa&amp;mode=experiment">https://prosodylab.org/~agobel/conepi/30-scaRFR_Pro2AFC/?SESSION_ID=Glossa&amp;mode=experiment</ext-link>.</p>
</sec>
<sec>
<title>3.3 Participants</title>
<p>64 monolingual native speakers of American English were recruited on Prolific and compensated $4 or $5 (depending on time). One participant&#8217;s response file was not properly saved and, hence, not annotated. During annotation, participants were excluded if their responses were unnatural (N = 5), or if they were monotonous across items, in that they almost exclusively chose either Fall or Verum Focus Fall (N = 21).<xref ref-type="fn" rid="n7">7</xref> We were, thus, left with 37 participants for the data analysis reported below.</p>
</sec>
<sec>
<title>3.4 Results</title>
<sec>
<title>3.4.1 Production rates</title>
<p>The counts by condition for each category are shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The first thing to note is that Fall is by far the most frequent contour used, comprising about 56% of the total recordings, even after excluding monotonous participants. Next, we can see that the RFR was used almost exclusively in the <sc>strong</sc> condition. The Concession Contour, on the other hand, trended toward occurring more frequently in the <sc>same</sc> condition, but was more evenly distributed. Finally, Verum Focus Fall occurred almost exclusively in the <sc>same</sc> condition. We discuss the implications of these findings below in 3.5.</p>
<fig id="F1">
<caption>
<p><bold>Figure 1:</bold> Production rates by contour and condition in Experiment 1. Lighter colors (left) correspond to the <sc>same</sc> condition, and darker colors (right), to the <sc>strong</sc> condition.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g1.png"/>
</fig>
</sec>
<sec>
<title>3.4.2 SI rates</title>
<p>We next looked at the rate of SI calculation, as measured by the inference task, depending on the contour produced by the participant. We restricted this analysis to Fall as a baseline, RFR as the intended contour of interest, and the Concession Contour for exploratory purposes. SI rates, i.e., the proportion of &#8220;Yes&#8221; responses, for those three contours by condition are shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.<xref ref-type="fn" rid="n8">8</xref> For the statistical analysis, we fit a logistic mixed effects regression model using the lme4 package in R (<xref ref-type="bibr" rid="B4">Bates et al., 2015</xref>). The model predicted Response in the inference task (&#8220;Yes&#8221; vs. &#8220;No&#8221;) as a function of Contour (RFR vs. Fall vs. Concession Contour), Condition (<sc>same</sc> vs. <sc>strong</sc>) and their interaction. It included the maximal random effects structure supported by the data (<xref ref-type="bibr" rid="B3">Barr et al., 2013</xref>): random by-participant and by-item intercepts and slopes for the Condition predictor. Both fixed effects predictors were treatment-coded: in Contour, the Fall level served as baseline, while in Condition, the <sc>strong</sc> level served as baseline.</p>
<fig id="F2">
<caption>
<p><bold>Figure 2:</bold> Mean SI rates (and SE) by contour and condition in Experiment 1. Lighter colors (left) correspond to the <sc>same</sc> condition, and darker colors (right), to the <sc>strong</sc> condition. Circles denote individual participants.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g2.png"/>
</fig>
<p>The analysis revealed the following results. First, the <sc>same</sc> condition produced lower SI rates than the <sc>strong</sc> condition (Estimate = &#8211;1.13, SE = 0.27, <italic>z</italic> = &#8211;4.14, <italic>p</italic> &lt; 0.001). There was no evidence that this effect differed across contours, i.e., there were no significant interactions (Estimate = &#8211;0.58, SE = 0.59, <italic>z</italic> = &#8211;0.98, <italic>p</italic> = 0.33; Estimate = 0.04, SE = 0.4, <italic>z</italic> = 0.11, <italic>p</italic> = 0.92). Second, Fall showed the lowest SI rate (33.5% in the <sc>same</sc> condition and 45.3% in the <sc>strong</sc> condition), followed by the Concession Contour (48.3% in the <sc>same</sc> condition and 61.6% in the <sc>strong</sc> condition), which produced a significantly higher rate (Estimate = 0.7, SE = 0.31, <italic>z</italic> = 2.25, <italic>p</italic> &lt; 0.05). Lastly, the RFR produced the highest SI rate (55.2% in the <sc>same</sc> condition and 70% in the <sc>strong</sc> condition), also significantly higher than the baseline Fall (Estimate = 0.89, SE = 0.23, <italic>z</italic> = 3.81, <italic>p</italic> &lt; 0.001).</p>
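<p>Because the model is logistic, the estimates above are on the log-odds scale. The following sketch is an illustration added for the reader rather than part of the reported analysis: it exponentiates the rounded estimates given above to obtain odds ratios relative to the treatment-coded baselines. The helper name <monospace>odds_ratio</monospace> and the dictionary labels are hypothetical.</p>
<preformat>
```python
import math

# Rounded log-odds estimates as reported in the text (Experiment 1).
# The dictionary keys are illustrative labels, not model output names.
ESTIMATES = {
    "Condition: same vs. strong": -1.13,
    "Contour: Concession Contour vs. Fall": 0.70,
    "Contour: RFR vs. Fall": 0.89,
}


def odds_ratio(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)


if __name__ == "__main__":
    for label, estimate in ESTIMATES.items():
        print(f"{label}: odds ratio = {odds_ratio(estimate):.2f}")
```
</preformat>
<p>An odds ratio above 1 corresponds to higher odds of a &#8220;Yes&#8221; response than at the baseline level, and one below 1 to lower odds. Note that, with treatment coding and an interaction term in the model, each of these simple effects is evaluated at the baseline level of the other predictor.</p>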
</sec>
</sec>
<sec>
<title>3.5 Discussion</title>
<sec>
<title>3.5.1 Production rates</title>
<p>The experiment provided data from two sources: production rates of contours and inference rates given the production of a certain contour. For production rates, there are four findings to mention. First, participants&#8217; primary choice of contour was a Fall, which made up slightly over half of all productions. We attribute this overwhelming preference to the online setting, with participants sitting by themselves in front of a computer, which may make it difficult to voice-act fully naturally. Additionally, the annotation focused solely on the overall pitch contour and did not take into account other acoustic factors, such as duration or intensity, which are worth investigating in future studies (see, for example, <xref ref-type="bibr" rid="B49">Sandberg &amp; Cole, 2022</xref>).</p>
<p>Second, we saw that Verum Focus Fall occurred almost exclusively in the <sc>same</sc> condition, where the scalar terms in the question prompt (Emma&#8217;s question) and the target sentence (the participant&#8217;s reply) were identical. This finding serves as a sanity check, since in the <sc>same</sc> condition, the scalar term in the reply (<italic>happy</italic>) is given, and accenting a given word is usually marked. Shifting prominence to the auxiliary prevents such a violation. However, not all items allowed for this pattern, since not all target sentences included auxiliaries (e.g., <italic>Did the train slow? It slowed</italic>.). This explains why the rate of Verum Focus Fall is not as high as one would expect.</p>
<p>Third, the RFR almost exclusively occurred in the <sc>strong</sc> condition, where the question prompt mentioned a stronger alternative. This is in line with G&#246;bel and Wagner&#8217;s (<xref ref-type="bibr" rid="B21">2023</xref>) account of the RFR, which takes it to convey the presence of a stronger alternative. In the <sc>strong</sc> condition, the requirement for a stronger alternative to be present is explicitly satisfied by the question prompt (<italic>Was the winner <bold>ecstatic</bold>?</italic>). In Section 5, we elaborate more on how other theoretical accounts might capture the observed <sc>strong-same</sc> asymmetry.</p>
<p>Finally, the experiment revealed the frequent use of a contour that was not previously considered as a relevant option for the given contexts, which we refer to as the Concession Contour, illustrated in (20) (repeated from (19)) with one of the productions elicited in the experiment. Its prosodic characteristics are an initial high tone followed by a fall up until a concluding rise. This pitch shape exactly parallels that of the so-called Contradiction Contour (<xref ref-type="bibr" rid="B39">Liberman &amp; Sag, 1974</xref>), illustrated in (21). However, intuitively the two contours seem to make different contributions, with the reply in (20) sounding much less like a proper contradiction. We will also return to this issue in Section 5.</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(20)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>(A: Was the winner happy/ecstatic?)</p></list-item>
<list-item><p>B: She was happy.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1dbzhmacH2TWyGRa17HsJTJrYLBS4awNk/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(21)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>JS</italic>: <underline>These balloons aren&#8217;t</underline> <underline>gonna stay</underline> <underline>filled &#8217;til New Year&#8217;s!</underline></p></list-item>
<list-item><p><italic>CK</italic>: <underline>Those aren&#8217;t for New Year&#8217;s!</underline> Those are my everyday balloons.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;(<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/14wwIDq0yP1UuTz5-kFzDQSbpYKxFzhJp/view?usp=sharing"><sc>audio</sc></ext-link>)</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
</sec>
<sec>
<title>3.5.2 SI rates</title>
<p>Moving on to SI rates, Experiment 1 had three main findings. First, SI rates were higher in the <sc>strong</sc> condition than in the <sc>same</sc> condition. This replicates Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>), who conducted the same manipulation using written stimuli. The <sc>strong-same</sc> difference can be explained in at least two ways. In the <sc>strong</sc> condition, the question directly mentions the relevant stronger alternative, thereby increasing its salience and encouraging hearers to reason about it. Additionally, in that condition, only on its SI-enriched meaning does the answer constitute a congruent one, in the sense of Gualmini et al. (<xref ref-type="bibr" rid="B27">2008</xref>), Hulsey et al. (<xref ref-type="bibr" rid="B35">2004</xref>), and Zondervan et al. (<xref ref-type="bibr" rid="B66">2008</xref>). Since this finding is not of primary interest to the current study, we direct the reader to Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>) for further discussion, as well as to, i.a., Degen (<xref ref-type="bibr" rid="B14">2013</xref>), Degen and Tanenhaus (<xref ref-type="bibr" rid="B15">2015</xref>), Kursat and Degen (<xref ref-type="bibr" rid="B37">2020</xref>), and Ronai and Xiang (<xref ref-type="bibr" rid="B45">2021b</xref>) for findings regarding the context-sensitivity of SI.</p>
<p>More crucially, we also found that the RFR led to an increase in SI rates, relative to a Fall. As discussed in 2.3, this pattern is unexpected based on several theoretical accounts of the RFR: those that take the contour to correspond to the alternative being left open (<xref ref-type="bibr" rid="B58">Wagner, 2012</xref>; <xref ref-type="bibr" rid="B59">Wagner et al., 2013</xref>; <xref ref-type="bibr" rid="B60">Ward &amp; Hirschberg, 1985</xref>). It is, however, directly in line with predictions of G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>). As for previous experimental results, de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>) had similarly found the RFR to result in increased SI rates, while Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear</xref>) found the opposite effect. We discuss how our findings can be reconciled with the latter study in detail in Section 5.</p>
<p>Lastly, Experiment 1 also found that the novel Concession Contour yielded a higher SI rate than a Fall, though the increase was numerically smaller than for the RFR. We return to a full discussion of this contour later.</p>
</sec>
<sec>
<title>3.5.3 Relation to scalar diversity</title>
<p>While the effect of the RFR on SI rates constitutes an interesting finding for theoretical accounts of the contour, further analyses are needed to more precisely determine how intonation interacts with the phenomenon of scalar diversity. For instance, one important possibility is that those lexical scales that show a high SI rate in written studies might also happen to be the ones produced more often with an RFR in our study, such that RFR rate is a direct correlate of SI rate. If this is the case, then the RFR does not actually encourage SI calculation, and its effect of leading to increased SI rates in Experiment 1 arises, instead, as an epiphenomenon. This hypothetical, whereby the RFR co-occurs with scales that robustly lead to SI, could also receive a different interpretation. Namely, since previous studies of scalar diversity relied on written stimuli, it is conceivable that certain scales were found to lead to higher SI rates than others due to their propensity to be silently read with an RFR intonation. On this view, finding that RFR rates and SI rates are linked would suggest that written studies of scalar diversity suffered from a confound. To investigate these possibilities, we conducted an additional correlational analysis. <xref ref-type="fig" rid="F3">Figure 3</xref> shows the by-item (that is, by-scale) correlation between RFR productions in the <sc>strong</sc> condition of Experiment 1 and SI rates from Ronai and Xiang&#8217;s (<xref ref-type="bibr" rid="B47">2024</xref>) written study, which conducted the same dialogue manipulation on the same stimuli as we did.</p>
<fig id="F3">
<caption>
<p><bold>Figure 3:</bold> By-item correlation between RFR productions in Experiment 1 and SI rates from Ronai and Xiang&#8217;s (<xref ref-type="bibr" rid="B47">2024</xref>) written study (<sc>strong</sc> condition).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g3.png"/>
</fig>
<p>We find a moderate positive correlation (Pearson&#8217;s correlation test: <italic>r</italic> = 0.45), indicating that the RFR was indeed produced more frequently with scales that are more likely to lead to SI.<xref ref-type="fn" rid="n9">9</xref> This can be interpreted as suggestive evidence that the RFR&#8217;s effect on SI rates is indeed related to its co-occurrence with lexical scales that are more likely to lead to SI or, alternatively, that a scale&#8217;s likelihood of triggering SI calculation is linked (in part) to its propensity to be silently read with the RFR. Before drawing any firmer conclusions, however, we will further investigate these possibilities in Experiment 2, which uses a perception task that allows us to assess the contribution of intonation independently of lexical factors.</p>
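<p>For readers less familiar with the by-item correlation reported above, the following is a minimal sketch of how Pearson&#8217;s correlation coefficient is computed from two per-scale vectors. It is written in plain Python for illustration and is not the code used for the reported analysis; the function name <monospace>pearson_r</monospace> is hypothetical, and the toy values in the usage example are invented and are not the study&#8217;s data.</p>
<preformat>
```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy values for five scales; NOT the study's data.
    rfr_counts = [0, 2, 3, 5, 8]               # RFR productions per scale
    si_rates = [0.20, 0.35, 0.30, 0.60, 0.75]  # written-study SI rate per scale
    print(f"r = {pearson_r(rfr_counts, si_rates):.2f}")
```
</preformat>
<p>In the analysis described above, each of the 60 lexical scales would contribute one pair of values: its number of RFR productions in the <sc>strong</sc> condition and its SI rate in the written study.</p>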
<p>In 2.1, we raised the possibility that the RFR may interact with predictors of scalar diversity in ways that constitute possible confounds, unless properly controlled for. One such factor is polarity. G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>) suggest that the RFR is only felicitous with positive statements, while Gotzner et al. (<xref ref-type="bibr" rid="B25">2018</xref>) have found higher SI rates with negative adjectival scales than with positive ones. On theories of the RFR that predict it to lower SI rates, e.g., because it indexes uncertainty, this would open up the possibility that what results in the polarity effect in scalar diversity is that positive scales are more likely to be silently read with the RFR. In Experiment 1, we found the RFR to increase SI rates, not decrease them, which suggests that the above confound is not at play. Nonetheless, it is worth briefly looking at the effect of polarity on both RFR productions and SI rates. To do this, we analyzed the subset of our lexical scales that had been annotated for polarity by Gotzner et al. (<xref ref-type="bibr" rid="B25">2018</xref>) (see their 2.1.2.4. for details). This included 21 adjectival scales that are fully identical across the two studies, as well as &lt;<italic>pretty, beautiful</italic>&gt;, where we adopted Gotzner et al.&#8217;s annotation for &lt;<italic>pretty, gorgeous</italic>&gt;. Of these 22 scales, the RFR was produced 52 times with positive scales (4.33 average) and 19 times with negative scales (1.9 average). That is, the RFR occurred more than twice as frequently with positive scales, in line with the predictions of G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>). To check the effect of polarity on SI rates, a logistic mixed effects model was fit, predicting Response (&#8220;Yes&#8221; vs. 
&#8220;No&#8221;) by Polarity (treatment-coded, with <sc>negative</sc> serving as baseline). The model included random intercepts for participants and items. This analysis revealed that <sc>positive</sc> and <sc>negative</sc> scales did not differ significantly from each other in their likelihood of leading to SI (Estimate = 0.12, SE = 0.86, <italic>z</italic> = 0.14, <italic>p</italic> = 0.89).<xref ref-type="fn" rid="n10">10</xref> This means that Experiment 1 ultimately did not reveal evidence supporting the hypothetical conspiracy of the RFR&#8217;s polarity asymmetry and effect on SI rates: we actually found that the RFR increases SI rates, and our set of items happened to include adjectival scales that do not show a polarity effect. But it remains the case that factors governing intonational contours overlap with those predicting SI rates, and future work should, therefore, still keep this potential interaction in mind.</p>
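<p>The per-scale averages reported above can be checked with simple arithmetic. The sketch below assumes 12 positive and 10 negative scales; these counts are not stated in the text but are inferred from the reported totals and averages, and they are consistent with the 22 scales analyzed. The function name <monospace>mean_per_scale</monospace> is hypothetical.</p>
<preformat>
```python
# Totals reported in the text.
RFR_POSITIVE = 52
RFR_NEGATIVE = 19
N_SCALES = 22

# ASSUMPTION: scale counts inferred from the reported averages, not stated
# directly in the text (52 / 4.33 is roughly 12; 19 / 1.9 is 10).
N_POSITIVE = 12
N_NEGATIVE = 10


def mean_per_scale(total, n_scales):
    """Average number of RFR productions per scale."""
    return total / n_scales


if __name__ == "__main__":
    assert N_POSITIVE + N_NEGATIVE == N_SCALES
    pos = mean_per_scale(RFR_POSITIVE, N_POSITIVE)
    neg = mean_per_scale(RFR_NEGATIVE, N_NEGATIVE)
    print(f"positive: {pos:.2f}, negative: {neg:.2f}, ratio: {pos / neg:.2f}")
```
</preformat>
<p>Under these assumed counts, the ratio of the two averages exceeds 2, which matches the statement that the RFR occurred more than twice as frequently with positive scales.</p>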
</sec>
</sec>
</sec>
<sec>
<title>4. Experiment 2: Perception + inference task</title>
<p>To address open questions from Experiment 1, Experiment 2 focuses on the independent effect of intonation on SI rates. For this, we combine a perception task with the inference task: participants listen to potentially SI-triggering sentences in different intonational conditions before making an SI judgment.</p>
<sec>
<title>4.1 Method, materials &amp; design</title>
<p>We used the same materials as in Experiment 1 (60 experimental stimuli + 20 fillers), but restricted to the <sc>strong</sc> condition, since the RFR was rarely produced in the <sc>same</sc> condition. Additionally, both the question prompt (Emma&#8217;s question) and the target sentence (now Luke&#8217;s reply) were presented auditorily, without the text being visible on the screen. The target sentence occurred with one of three contours: a <sc>Fall</sc>, the <sc>RFR</sc>, or the <sc>Concession Contour</sc>, in a Latin Square design. After listening to one version of the dialogue, participants were asked the same task question (italicized in (22)) as in Experiment 1 &#8211; with the only modification being that the target speaker was no longer referred to as <italic>you</italic> but as <italic>Luke</italic>, i.e., the task question included <italic>Given Luke&#8217;s response&#8230;</italic>. As before, we take a &#8220;Yes&#8221; response to index SI calculation, and a &#8220;No&#8221; response to index that the participant has not calculated the SI. A sample item with recordings is shown in (22), and pitch tracks for the three contours are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4">
<caption>
<p><bold>Figure 4:</bold> Pitch tracks for the target sentence <italic>It was serious</italic>, from the &lt;<italic>serious, life-threatening</italic>&gt; scale, with <sc>Fall</sc> in black (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/12MTtz2H5BNsbPnAV5juXoherIq5dzdBW/view?usp=sharing"><sc>audio</sc></ext-link>), <sc>RFR</sc> in orange (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1qHEpM6mzsel8ONYyZuT2ntTf3Ncxxvam/view?usp=sharing"><sc>audio</sc></ext-link>) and <sc>Concession Contour</sc> in purple (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/17bPQYnJjzZcaY3iOS2OlxnwLCrP8z2oZ/view?usp=sharing"><sc>audio</sc></ext-link>).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g4.png"/>
</fig>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(22)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Sample Item, Experiment 2</p></list-item>
<list-item><p><italic>Emma</italic>: Was the winner ecstatic?</p></list-item>
<list-item><p><italic>Luke</italic>: She was happy. {[<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1ajH-9yMQFntc9N9LmGOa93H5iuOI3eCR/view?usp=sharing"><sc>fall</sc></ext-link>], [<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1Z5c0t6R6ST9UgAJOYVLggF7r6WMX4wqW/view?usp=sharing"><sc>rfr</sc></ext-link>], [<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://drive.google.com/file/d/1aN5HZuOYTE1tg0WmtLShM-iQ2Uv47NSb/view?usp=sharing"><sc>concession</sc></ext-link>]}</p></list-item>
<list-item><p>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<italic>Given Luke&#8217;s response, do you think Emma would conclude that the winner was not ecstatic?</italic></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
</sec>
<sec>
<title>4.2 Procedure</title>
<p>The general procedure was largely the same as for Experiment 1, except that there was no microphone check. A test version can be accessed at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://prosodylab.org/~agobel/conepi/31-scaRFR_Aud2AFC/?SESSION_ID=Glossa&amp;mode=experiment">https://prosodylab.org/~agobel/conepi/31-scaRFR_Aud2AFC/?SESSION_ID=Glossa&amp;mode=experiment</ext-link>.</p>
</sec>
<sec>
<title>4.3 Participants</title>
<p>90 monolingual native speakers of American English were recruited on Prolific and compensated $2.50. 17 participants were excluded for failing the headphone test. Data from the remaining 73 participants is reported below.</p>
</sec>
<sec>
<title>4.4 Results</title>
<p>SI rates &#8211; that is, the proportion of &#8220;Yes&#8221; responses &#8211; by contour are shown in <xref ref-type="fig" rid="F5">Figure 5</xref>. To analyze the results, we fit a logistic mixed effects regression model predicting Response (&#8220;Yes&#8221; vs. &#8220;No&#8221;) as a function of Contour (Fall vs. RFR vs. Concession Contour). The fixed effects predictor was treatment-coded, with Fall as the reference level. The maximal converging random effects structure included by-participant intercepts and by-item intercepts and slopes. We found significantly higher rates of SI calculation with the RFR than with the Fall (Estimate = 0.4, SE = 0.12, <italic>z</italic> = 3.25, <italic>p</italic> &lt; 0.01). The difference between Fall and Concession Contour, on the other hand, was not significant (Estimate = 0.04, SE = 0.12, <italic>z</italic> = 0.39, <italic>p</italic> = 0.70).</p>
<fig id="F5">
<caption>
<p><bold>Figure 5:</bold> SI rates (and SE) by contour in Experiment 2. Circles correspond to individual participants.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g5.png"/>
</fig>
</sec>
<sec>
<title>4.5 Discussion</title>
<p>The results largely replicated the findings from Experiment 1. Fall received the lowest SI rate (54.5%), RFR the highest (62.5%), and the Concession Contour was numerically in between the two (57.4%). However, the differences were much smaller than in Experiment 1, such that only the comparison between Fall and RFR reached statistical significance. This compression may be due to the more mediated nature of the task: rather than judging one&#8217;s own production &#8211; and, by virtue of that, most likely one&#8217;s own intention &#8211; participants in the perception experiment had to reason not only about the intention behind someone else&#8217;s choice of intonation, but also about how that choice might affect the hearer. The fact that the experiment was able to replicate the previous data is thus even more notable. As before, the RFR leading to an increase in SI rates supports theoretical accounts where it is analyzed as indicating the presence of a stronger alternative (<xref ref-type="bibr" rid="B20">G&#246;bel, 2019</xref>; <xref ref-type="bibr" rid="B21">G&#246;bel &amp; Wagner, 2023</xref>), while it is less compatible with those that take the contour to correspond to uncertainty or the alternative being left open (<xref ref-type="bibr" rid="B58">Wagner, 2012</xref>; <xref ref-type="bibr" rid="B59">Wagner et al., 2013</xref>; <xref ref-type="bibr" rid="B60">Ward &amp; Hirschberg, 1985</xref>).</p>
<p>In our discussion of Experiment 1, we raised the possibility that the RFR increasing SI rates is epiphenomenal, i.e., that it arises because the RFR occurs more frequently with items that are more likely to lead to SI to begin with. Experiment 2 was specifically conducted to further probe this possibility by directly manipulating the intonational contour of all items. Experiment 2 found the same effect of the RFR as Experiment 1, and this effect cannot be reduced to by-scale variation in SI rates, since the RFR vs. Fall contrast occurred with all items. We can therefore conclude that the RFR indeed encourages SI calculation.</p>
<p>As mentioned, a different interpretation of the (moderate) by-scale correlation between RFR productions and SI rates is also available. Namely, it could be the case that the reason why certain scales were found to produce higher rates of SI than others in written studies of scalar diversity is that such scales are the ones more likely to be silently read with the RFR. The current Experiment 2 data is also informative with respect to this possibility, as we can check whether the different lexical scales&#8217; relative likelihood of leading to SI remained consistent across different intonational contours. To do this, we calculated rank-order correlations using Kendall&#8217;s <italic>&#964;<sub>B</sub></italic>.<xref ref-type="fn" rid="n11">11</xref> This analysis finds that SI rates with the RFR are correlated with both the Fall (<italic>&#964;<sub>B</sub></italic> = 0.63) and the Concession Contour (<italic>&#964;<sub>B</sub></italic> = 0.67), showing that the relative order of different lexical scales remains largely (though not entirely) the same. This, in turn, suggests that what makes a lexical scale a &#8220;high SI rate&#8221; scale is not simply its propensity to be silently read with RFR in a written study.</p>
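<p>To make the rank-order comparison concrete, the following minimal sketch computes Kendall&#8217;s <italic>&#964;<sub>B</sub></italic> for two lists of per-scale SI rates. It is an illustration only and not the analysis code from the accompanying repository: the function names <monospace>sign</monospace> and <monospace>kendall_tau_b</monospace> are hypothetical, and the five SI rates are invented for this example rather than taken from our data.</p>
<preformat>
```python
from itertools import combinations
from math import sqrt


def sign(x):
    # Returns 1, 0, or -1 depending on the sign of x.
    return (x > 0) - (0 > x)


def kendall_tau_b(xs, ys):
    """Kendall's tau-b for two equal-length sequences of scores.

    tau-b = (concordant - discordant) / sqrt(untied_x * untied_y),
    where untied_x (untied_y) counts the pairs not tied on xs (ys).
    The statistic is undefined if one sequence is constant.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant_minus_discordant = 0
    untied_x = 0
    untied_y = 0
    for i, j in combinations(range(len(xs)), 2):
        sx = sign(xs[i] - xs[j])
        sy = sign(ys[i] - ys[j])
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
        concordant_minus_discordant += sx * sy
        untied_x += abs(sx)
        untied_y += abs(sy)
    return concordant_minus_discordant / sqrt(untied_x * untied_y)


# Invented SI rates for five hypothetical scales (NOT the experimental data).
si_rate_rfr = [0.90, 0.75, 0.60, 0.40, 0.20]
si_rate_fall = [0.85, 0.60, 0.65, 0.30, 0.10]

tau = kendall_tau_b(si_rate_rfr, si_rate_fall)
print(round(tau, 2))
```
</preformat>
<p>In this invented example, nine of the ten scale pairs are ordered the same way under both contours and one pair is reversed, so the statistic is (9 &#8211; 1)/10 = 0.8. Identical rankings yield 1, and fully reversed rankings yield &#8211;1.</p>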
<p>The next section turns to further discussion of what the results of Experiments 1 and 2 tell us about SIs and scalar diversity, as well as intonational contours.</p>
</sec>
</sec>
<sec>
<title>5. General discussion</title>
<p>This article presented data from two experiments investigating the role of intonation for scalar diversity. Experiment 1 used the combination of production and an inference task to assess, first, what contours speakers use in dialogues involving scalar terms, and second, how the choice of contour affects the likelihood of drawing an SI across different scales. We saw that &#8211; despite a strong overall preference for Fall, likely due to the online setting &#8211; participants frequently used the RFR, as expected, in addition to the unexpected use of a contour we labeled Concession Contour. Crucially, both the RFR and the Concession Contour led to an increase in SI rates relative to a Fall, with the RFR&#8217;s effect being stronger. Moreover, the rate at which participants produced each of these contours was not uniform, but varied by lexical scale. To ensure that the effect of contour on SI rates was not actually driven by this variation, Experiment 2 presented participants with recordings of the same target sentences, manipulating the type of contour. The results from this experiment replicated the main overall pattern, with both the RFR and the Concession Contour numerically increasing the likelihood of SI, relative to Fall, although only the comparison between RFR and Fall was significant. We now turn to discussing how the combined findings inform the study of scalar diversity specifically, and SI more generally, as well as the theories of the intonational contours involved.</p>
<p>The finding that scales vary in their likelihood of receiving an RFR contour in production has implications for scalar diversity. Namely, it raises the possibility that scales similarly vary in whether they are silently read with the RFR in written studies. As a result, when comparing SI rates across different lexical scales using written stimuli, it is not easily discernible whether differences are driven by the lexical scales themselves or are mediated through the effect of lexical scales on rates of intonational contours, which then affect the likelihood of SI. At the same time, the by-scale correlation between RFR productions and written study-based SI rates is only moderate (<xref ref-type="fig" rid="F3">Figure 3</xref>), and the relative ranking of scales remains reasonably consistent across different intonational contours (see 4.5). This cautions against too strong a reading of our results, namely one on which the masking of intonation in prior studies is so serious a confound that existing scalar diversity findings must necessarily be reassessed. Instead, it seems likely that the interaction of scalar terms with intonation, e.g., their propensity to be silently read with the RFR, is one among many factors &#8211; such as semantic distance or boundedness, as discussed in 2.1 &#8211; that play a role in scalar diversity.</p>
<p>As things stand, it is not well understood precisely what factors matter for the felicity of a contour such as the RFR, nor is the observed variation in SI calculation fully explained. But, based on existing proposals, some properties of the linguistic signal matter for both. We have investigated one such factor, the polarity of adjectival scales, following up on the possibility that negative scales only lead to higher rates of SI than positive scales due to a difference in their compatibility with the RFR. While we ultimately found this not to be the case in our own data set, such possibilities should be taken into account in future work. More generally, based on the results of our article (as well as prior work, such as <xref ref-type="bibr" rid="B24">Gotzner, 2019</xref>; <xref ref-type="bibr" rid="B55">Tomlinson et al., 2017</xref>), future studies of SI should ideally control for the effects of intonation. As mentioned, we do not fully understand what governs the use of the RFR, and our work has found it to increase the rate of SI calculation. Consequently, written studies can never fully rule out the possibility that a participant&#8217;s SI judgment was affected by projecting an RFR contour onto the stimuli.</p>
<p>Turning to the implications of the results for accounts of the RFR, there were three relevant sources of evidence. First, the RFR was almost exclusively produced in Experiment 1 when the question prompt contained a stronger alternative to the scalar term in the target sentence. Of the accounts discussed in 2.3, this pattern is most straightforwardly explained by the account of G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>). On their view, the RFR presupposes the presence of a stronger alternative, which is provided in the <sc>strong</sc> condition, but not the <sc>same</sc> condition, thus explicitly licensing its use. The increased production rate could then be attributed to a principle like Maximize Presupposition (<xref ref-type="bibr" rid="B29">Heim, 1991</xref>), which encourages speakers to use a presupposition trigger &#8211; in this case the intonational contour &#8211; whenever possible. The accounts of Constant (<xref ref-type="bibr" rid="B10">2012</xref>) and Wagner et al. (<xref ref-type="bibr" rid="B59">2013</xref>) capture these results, insofar as they rule out the use of the RFR in the <sc>same</sc> condition, either due to no alternatives being left open (Constant) or the reply providing a complete answer (Wagner et al.). However, these accounts would have to be augmented by a general theory of why people choose to express one meaning over another when there are multiple options, since neither is couched in terms of presupposition, and hence a principle like Maximize Presupposition does not apply.</p>
<p>Westera (<xref ref-type="bibr" rid="B62">2019</xref>), in turn, could capture the observed <sc>strong-same</sc> difference in RFR productions via the assumption that finding a secondary QUD is easier in the <sc>strong</sc> condition. Such an assumption seems plausible, given that the <sc>same</sc> condition is maximally restricted in how the reply relates to the question, while the lexical mismatch in the <sc>strong</sc> condition leaves it more open whether the question is sufficiently addressed, potentially leading participants to search for other questions that could be addressed instead, or in addition. In contrast to previous accounts, on Ward and Hirschberg&#8217;s (<xref ref-type="bibr" rid="B60">1985</xref>) account, it is less clear how to explain the production patterns; given the uncertainty view, there is no obvious reason why there should be an asymmetry between our two conditions. Finally, the RFR being restricted to the <sc>strong</sc> condition is least compatible with Wagner (<xref ref-type="bibr" rid="B58">2012</xref>), since on this account, the RFR is treated as quantifying over alternative speech acts, rather than being restricted by focus. As such, it is not clear why one could not indicate something else that could have been said in the <sc>same</sc> condition, leaving it unexplained why the two conditions should differ.</p>
<p>Secondly, and most importantly, the experimental results showed that the RFR led to an increase in SI rate, relative to a Fall. This finding provides direct evidence against the accounts of Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), Wagner (<xref ref-type="bibr" rid="B58">2012</xref>) and Wagner et al. (<xref ref-type="bibr" rid="B59">2013</xref>), all of which predict a decrease instead. Constant (<xref ref-type="bibr" rid="B10">2012</xref>) and Westera (<xref ref-type="bibr" rid="B62">2019</xref>), on the other hand, are able to capture our empirical findings, since these accounts are, in principle, compatible with both uncertainty and strengthening. Finally, similarly to the data from production rates, the SI rate pattern is accounted for by G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>) as an effect of salience: by virtue of the RFR presupposing the presence of a stronger alternative, this alternative can be assumed to be more salient, which, in turn, facilitates drawing the SI. This reasoning is analogous to a salience-based explanation of the finding &#8211; replicated here &#8211; that mentioning the stronger alternative in a preceding question in a dialogue also increases SI rate.</p>
<p>It is also worth briefly addressing the third source of evidence for accounts of the RFR, namely, the variation in production rate across lexical scales. As discussed in relation to <xref ref-type="fig" rid="F3">Figure 3</xref>, the RFR occurred more than twice as often on lexical scales of positive polarity, compared to scales of negative polarity. This pattern is uniquely in line with the accounts of G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>): on the view that the alternative presupposed by the RFR is not simply stronger, but higher on a scale, pairs like &lt;<italic>ugly, hideous</italic>&gt; would be expected to license an SI by virtue of one item being stronger, but not license the use of the RFR on the weaker item, given that <italic>ugly</italic> is a higher degree of beauty relative to <italic>hideous</italic> and hence higher on the scale. However, it is also clear that this account cannot capture the full range of observed variation across scales. While a more detailed discussion of how the various scale properties could relate to accounts of the RFR goes beyond the scope of this article, we believe that it serves as a promising source of evidence for future research.</p>
<p>Let us now turn to the question of how the effect of the RFR on SI rate in our experiments relates to prior studies (see also 2.4). Our findings are in line with de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>), who similarly found that the RFR increased SI rates. Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear</xref>), on the other hand, found that participants are more likely to pair the RFR with an ignorance inference interpretation, and a Fall with an SI, than the other way around. The authors took this finding to support an uncertainty view of the RFR. Given such an account, and, indeed, Buccola and Goodhue&#8217;s empirical results, we would expect the RFR to decrease SI rates, which is the opposite of what we found. Here, we suggest some ways in which this conflict can be reconciled.</p>
<p>Notably, Buccola and Goodhue&#8217;s (<xref ref-type="bibr" rid="B7">to appear</xref>) experiments tested SIs based on only the &lt;<italic>some, all</italic>&gt; scale, while de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>) and the current article focused on a larger number of lexical scales. In fact, in our Experiment 2, the effect of intonation on the &lt;<italic>some, all</italic>&gt; scale is in line with Buccola and Goodhue: SI was calculated at a rate of 88.57% with Fall, and 73.33% with the RFR. Since this data point represents only one of our 60 items, it does not lend itself to statistical analysis, but the numerical trend observed is in the direction supported by Buccola and Goodhue&#8217;s work: the RFR led to fewer SIs. Taking the three relevant studies together, it is conceivable that the <italic>some but not all</italic> SI is affected differently by intonation than other scales. While it is perhaps the paradigmatic example of SI, there are other ways in which it is not representative of the entire class of lexical scales: for example, it leads to SI calculation more robustly than almost any other scale.</p>
<p>Another important difference between our experiments and Buccola and Goodhue&#8217;s is that they specifically contrasted SIs with ignorance inferences: both potential meanings were explicitly made available to participants. In contrast, the inference task in our own experiments is primarily aimed at identifying when participants have calculated an SI (&#8220;Yes&#8221; response), but obscures some other possible interpretations. As noted by Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>), given our dialogue manipulation, three different meanings may underlie Luke&#8217;s answer <italic>She was happy</italic>. It could correspond to an SI-enriched meaning (23a), or an ignorance meaning (23b), where Luke can only say that the winner was happy, but he does not know whether she was ecstatic, or to a meaning where the weaker term <italic>happy</italic> is used as a (near)-synonym to <italic>ecstatic</italic> (23c).</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(23)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>Emma: Was the winner ecstatic?</p></list-item>
<list-item><p>Luke: She was happy.</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>a.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>She was happy (but not ecstatic).&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;SI</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>b.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>(Well,) she was happy.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;ignorance</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>&#160;</p></list-item>
</list>
<list list-type="wordfirst">
<list-item><p>c.</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>(Yes,) she was happy.&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;happy &#8776; ecstatic</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Making the same observation that there are these three possible meanings, Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear, p. 11</xref>) argue that the RFR is intuitively only compatible with (23a) and (23b), while a Fall is only compatible with (23a) and (23c). They further reason that if participants frequently intended to produce meanings like (23c) in our Experiment 1, and used Fall to do so, that could explain why not many of the Fall productions corresponded to SI calculation. This, coupled with a bias for SI over ignorance inferences affecting the interpretations of the RFR, could have led to the illusion that RFR leads to more SIs than Fall. However, this only explains the findings of our Experiment 1, not our Experiment 2. Moreover, if Fall is indeed inappropriate for conveying a meaning like (23b), that could have influenced the authors&#8217; own experiment. Namely, participants may have chosen to pair an ignorance inference with the RFR simply to avoid matching it to a contour it is incompatible with (Fall). We, thus, take the overall evidence to be a challenge for uncertainty accounts of the RFR, and more in favor of accounts such as G&#246;bel (<xref ref-type="bibr" rid="B20">2019</xref>) and G&#246;bel and Wagner (<xref ref-type="bibr" rid="B21">2023</xref>).</p>
<p>Lastly, the relevance of the present study for research on the meaning of intonational contours goes beyond theories of the RFR, in that it reveals the frequent use of the Concession Contour. In terms of its distribution, we found that the Concession Contour was produced more in the <sc>same</sc> condition, but was also present in the <sc>strong</sc> condition. Additionally, in both experiments, the Concession Contour, like the RFR, contributed to an SI rate increase, relative to a Fall, although to a lesser extent. Given that our main focus is on SIs and the finding of the Concession Contour is unexpected, we believe a full account of the Concession Contour goes beyond the scope of this article. One point we would like to address, however, is how the Concession Contour might relate to the recognized category of the Contradiction Contour, and, specifically, whether they should be treated as variants of each other or as distinct categories.</p>
<p>As mentioned in 3.5, the Concession Contour and the Contradiction Contour seem closely related prosodically: both contours have an initial &#8220;floating&#8221; high tone that is not aligned with lexical stress, as well as a final rise. One notable difference is that the pitch height for the Contradiction Contour seems exaggerated. However, this difference could be attributed to para-linguistic factors, such as emotional arousal. The exaggerated contour of the Contradiction Contour is, thus, not a conclusive argument against treating it as a variant of the Concession Contour (and vice versa).</p>
<p>On the semantic-pragmatic side, the Contradiction Contour has been argued to presuppose that there is contextual evidence for the complement of the prejacent proposition (<xref ref-type="bibr" rid="B23">Goodhue &amp; Wagner, 2018</xref>). That is, in the stereotypical contradictory use, the Contradiction Contour on <italic>p</italic> is licensed by the prior assertion of &#172;<italic>p</italic> (and vice versa). However, the question is whether this account could capture the use of the contour in our experimental conditions, i.e., in response to a question, and its effect on SI rate. One possibility could be to adjust the account to conceive of contextual evidence in terms of degrees (cf. Farkas &amp; Roelofsen&#8217;s (<xref ref-type="bibr" rid="B17">2017</xref>) notion of credence levels). Given that asking about a proposition <italic>p</italic> (= <italic>?p</italic>) is usually only licensed when the speaker is not committed to either <italic>p</italic> or &#172;<italic>p</italic>, there is a sense in which the question act raises doubt about <italic>p</italic> and, hence, provides contextual evidence &#8211; albeit weak &#8211; against <italic>p</italic> being true, licensing the Contradiction Contour in the guise of what we labeled the Concession Contour. This adjustment could, then, easily account for the use of the contour in the <sc>same</sc> condition, and also in the <sc>strong</sc> condition, on the additional assumption that asking about a stronger alternative implies doubt about a weaker alternative as well, although to a lesser degree, in line with the numerical difference in production rates. 
The account could even allow one to integrate the observation about the difference between the Contradiction Contour and the Concession Contour regarding pitch exaggeration, by correlating pitch height with the degree of contextual evidence: pitch will be higher overall in a contradictory use, since asserting <italic>p</italic> constitutes the maximal amount of contextual evidence against &#172;<italic>p</italic>, whereas pitch is reduced in reply to a question, because &#172;<italic>p</italic> is, by definition, still at least a possibility.</p>
<p>However, it is unclear how this account could explain the increase in SI rate with the Concession Contour found in the experiments. Using a contour to communicate that there is evidence against <italic>p</italic> (for instance, <italic>The winner was happy</italic>) is, in principle, independent of the attitude the speaker has toward its strengthened interpretation. Moreover, and maybe more crucially, viewing the use of the Concession Contour in terms of contextual evidence fails to take into account properties of the lexical scales. But as <xref ref-type="fig" rid="F6">Figure 6</xref> shows, there was substantial variation observed in production rates of the Concession Contour across scales &#8211; as with the case of the RFR. We, thus, conclude that a unified account of Concession Contour and Contradiction Contour in terms of gradient contextual evidence faces serious challenges, and leave a more in-depth investigation into this issue and alternative possibilities for future research.</p>
<fig id="F6">
<caption>
<p><bold>Figure 6:</bold> Production rates for Concession Contour by item in Experiment 1.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="glossapx-3-1-4911-g6.png"/>
</fig>
</sec>
<sec>
<title>6. Conclusion</title>
<p>Recent research in experimental pragmatics has focused on the phenomenon of scalar diversity: the inter-scale non-uniformity of SI. While most existing studies in this domain have relied on written stimuli, the present article begins to probe the interplay of intonation and SI calculation across different scales. In two experiments, we tested the effect of different contours (either produced by participants or directly manipulated in the stimuli) on the robustness of SI calculation. We found that SI rates varied by contour, and, in particular, the so-called RFR contour made SIs more likely to arise. The production rate of RFR also varied across lexical scales. These findings point to the importance of considering intonation in studies of SI and scalar diversity. Further, they also inform theoretical treatments of the RFR, being most easily captured by accounts that link its felicity to the presence of a stronger alternative.</p>
</sec>
</body>
<back>
<fn-group>
<fn id="n1"><p>We follow Ladd (<xref ref-type="bibr" rid="B38">2008</xref>) in defining <italic>intonation</italic> as suprasegmental phonetic features ranging over sentences in a linguistically structured way.</p></fn>
<fn id="n2"><p>Notably, while all languages featured here can be argued to have broadly similar functions for pitch accents &#8211; the part of the overall contour perceived as prominent and often marked by a change in pitch &#8211; their intonational systems also differ slightly. The discussion should, therefore, not be taken to imply that the phonetic-phonological details of pitch accents are identical across these languages.</p></fn>
<fn id="n3"><p>The label for this accent type is part of the widely adopted ToBI annotation system (<xref ref-type="bibr" rid="B5">Beckman et al., 2005</xref>), derived from the autosegmental-metrical (AM) theory of intonation (<xref ref-type="bibr" rid="B42">Pierrehumbert, 1980</xref>). On this approach, a sentence-level contour consists of a sequence of low (L) and high (H) pitch targets of different accent types (pitch accent, phrasal accent, boundary accent/tone), with the &#8216;*&#8217; indicating prominence of pitch accents. In the literature presented here and adjacent to it, the L+H* accent is often taken to convey contrastive Focus, but terminology is not always defined and phonetic details may vary or are often missing. For the purposes of this paper, we define <italic>focus</italic> as a semantic-pragmatic correlate of (at least some) pitch accents that evokes alternatives (<xref ref-type="bibr" rid="B36">Krifka, 2008</xref>; <xref ref-type="bibr" rid="B48">Rooth, 1992</xref>).</p></fn>
<fn id="n4"><p>Note that this judgment is only meant to hold in the absence of further context. For instance, in a situation where A is intentionally looking for an ugly gift for a person they do not like, B&#8217;s reply seems quite acceptable.</p></fn>
<fn id="n5"><p>In the case of Ward and Hirschberg (<xref ref-type="bibr" rid="B60">1985</xref>), there is an open question about what level the uncertainty could be conveyed at, given the different options provided (8a)&#8211;(8c). Following de Marneffe and Tonhauser (<xref ref-type="bibr" rid="B13">2019</xref>), we will assume that the most sensible option is one where uncertainty relates to the choice of scalar value rather than the existence or type of scale, given that the target items in studies of SI are inherently scalar.</p></fn>
<fn id="n6"><p>Buccola and Goodhue (<xref ref-type="bibr" rid="B7">to appear</xref>) and Ronai and Xiang (<xref ref-type="bibr" rid="B47">2024</xref>) argue that a &#8220;No&#8221; response is also compatible with ignorance regarding the status of the stronger alternative. We come back to this issue in Section 5.</p></fn>
<fn id="n7"><p>While this latter criterion leads to a high exclusion rate, we consider it justified, since we were not interested in how often people use the RFR in general, but in <italic>when</italic> and <italic>how</italic> they use it. We attribute the large number of exclusions mainly to the online setting; participants were, essentially, asked to simulate a natural-sounding conversation while sitting by themselves in front of a computer. As a result, even though all experimental items were dialogues, many participants&#8217; productions resembled reading a passage out loud from a book instead of participation in a conversation.</p></fn>
<fn id="n8"><p>Note that while individual circles in <xref ref-type="fig" rid="F2">Figure 2</xref> correspond to the proportion of &#8220;Yes&#8221; responses per participant in that condition, these proportions also depend on how many times that participant produced the given contour. That is, if a participant only produced the RFR on one item, then their proportion of &#8220;Yes&#8221; responses could only be 100% or 0%. This problem will not arise in Experiment 2, when we directly manipulate contours.</p></fn>
<fn id="n9"><p>The same correlation is much weaker in the <sc>same</sc> condition (Pearson&#8217;s correlation test: r = 0.24), which follows from the RFR not being produced very robustly in the <sc>same</sc> condition in the first place.</p></fn>
<fn id="n10"><p>This would seem to run counter to Gotzner et al.&#8217;s findings, but, in fact, if we look at the same 22 scales from their work that were tested in our study, we find that those also did not produce different rates of SI; in the relevant subset of Gotzner et al.&#8217;s data, the average SI rate for positive scales is 37.17%, while for negative ones, it is 37.1%.</p></fn>
<fn id="n11"><p>A statistic of &#8211;1 indicates full reversal of the rankings, while a statistic of 1 indicates the same ranking, i.e., that lexical scales occur in the same order when ranked by the SI rate they produced.</p></fn>
</fn-group>
<sec>
<title>Data accessibility statement</title>
<p>Experimental files, results files and analysis code are stored in an OSF repository at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://osf.io/6a9wg/?view_only=e14a3c89b1474918b30c1d59c202fbff">https://osf.io/6a9wg/?view_only=e14a3c89b1474918b30c1d59c202fbff</ext-link>.</p>
</sec>
<sec>
<title>Ethics and consent</title>
<p>The studies reported in this paper were approved by the Princeton Institutional Review Board (#15015). Informed consent was obtained from all participants.</p>
</sec>
<sec>
<title>Acknowledgements</title>
<p>We are indebted to Emma Nguyen and Luke Adamson for providing audio stimuli, to Thomas Sostarics for help with the visualizations, as well as to Dan Goodhue, Sunwoo Jeong, Deniz Rudin, Michael Wagner, the UPenn Experimental Semantics Lab, the SALT 33 audience, and three anonymous reviewers for feedback. This material is partially based upon work supported by the National Science Foundation under Grant No. BCS-2041312.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<sec>
<title>Authors&#8217; contributions</title>
<p>The authors contributed equally to this work and are listed in reverse alphabetical order.</p>
</sec>
<sec>
<title>ORCiD IDs</title>
<p>Eszter Ronai: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://orcid.org/0000-0003-1578-0938">https://orcid.org/0000-0003-1578-0938</ext-link></p>
<p>Alexander G&#246;bel: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://orcid.org/0000-0002-7920-9071">https://orcid.org/0000-0002-7920-9071</ext-link></p>
</sec>
<ref-list>
<ref id="B1"><label>1</label><mixed-citation publication-type="book"><string-name><surname>Bader</surname>, <given-names>M.</given-names></string-name> (<year>1998</year>). <chapter-title>Prosodic influences on reading syntactically ambiguous sentences</chapter-title>. In <string-name><given-names>J. D.</given-names> <surname>Fodor</surname></string-name> &amp; <string-name><given-names>F.</given-names> <surname>Ferreira</surname></string-name> (Eds.), <source>Reanalysis in sentence processing</source> (pp. <fpage>1</fpage>&#8211;<lpage>46</lpage>). <publisher-name>Kluwer</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1007/978-94-015-9070-9_1</pub-id></mixed-citation></ref>
<ref id="B2"><label>2</label><mixed-citation publication-type="journal"><string-name><surname>Baker</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Doran</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>McNabb</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Larson</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Ward</surname>, <given-names>G.</given-names></string-name> (<year>2009</year>). <article-title>On the non-unified nature of scalar implicature: An empirical investigation</article-title>. <source>International Review of Pragmatics</source>, <volume>1</volume>(<issue>2</issue>), <fpage>211</fpage>&#8211;<lpage>248</lpage>. DOI: <pub-id pub-id-type="doi">10.1163/187730909X12538045489854</pub-id></mixed-citation></ref>
<ref id="B3"><label>3</label><mixed-citation publication-type="journal"><string-name><surname>Barr</surname>, <given-names>D. J.</given-names></string-name>, <string-name><surname>Levy</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Scheepers</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Tily</surname>, <given-names>H. J.</given-names></string-name> (<year>2013</year>). <article-title>Random effects structure for confirmatory hypothesis testing: Keep it maximal</article-title>. <source>Journal of Memory and Language</source>, <volume>68</volume>(<issue>3</issue>), <fpage>255</fpage>&#8211;<lpage>278</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2012.11.001</pub-id></mixed-citation></ref>
<ref id="B4"><label>4</label><mixed-citation publication-type="journal"><string-name><surname>Bates</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>M&#228;chler</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Bolker</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Walker</surname>, <given-names>S.</given-names></string-name> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using lme4</article-title>. <source>Journal of Statistical Software</source>, <volume>67</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>48</lpage>. DOI: <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></mixed-citation></ref>
<ref id="B5"><label>5</label><mixed-citation publication-type="book"><string-name><surname>Beckman</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Hirschberg</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Shattuck-Hufnagel</surname>, <given-names>S.</given-names></string-name> (<year>2005</year>). <chapter-title>The original ToBI system and the evolution of the ToBI framework</chapter-title>. In <string-name><given-names>S.-A.</given-names> <surname>Jun</surname></string-name> (Ed.), <source>Prosodic typology: The phonology of intonation and phrasing</source> (pp. <fpage>9</fpage>&#8211;<lpage>54</lpage>). <publisher-name>Oxford University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1093/acprof:oso/9780199249633.003.0002</pub-id></mixed-citation></ref>
<ref id="B6"><label>6</label><mixed-citation publication-type="journal"><string-name><surname>Beltrama</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Xiang</surname>, <given-names>M.</given-names></string-name> (<year>2013</year>). <article-title>Is &#8216;good&#8217; better than &#8216;excellent&#8217;? An experimental investigation on scalar implicatures and gradable adjectives</article-title>. In <string-name><given-names>E.</given-names> <surname>Chemla</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Homer</surname></string-name>, &amp; <string-name><given-names>G.</given-names> <surname>Winterstein</surname></string-name> (Eds.), <source>Proceedings of Sinn und Bedeutung</source> <volume>17</volume> (pp. <fpage>81</fpage>&#8211;<lpage>98</lpage>).</mixed-citation></ref>
<ref id="B7"><label>7</label><mixed-citation publication-type="webpage"><string-name><surname>Buccola</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Goodhue</surname>, <given-names>D.</given-names></string-name> (to appear). <article-title>The effect of intonation on scalar and ignorance inferences</article-title>. <source>Proceedings of the 59th Annual Meeting of the Chicago Linguistic Society (CLS 59)</source>, <uri>https://ling.auf.net/lingbuzz/007464</uri>.</mixed-citation></ref>
<ref id="B8"><label>8</label><mixed-citation publication-type="journal"><string-name><surname>B&#252;ring</surname>, <given-names>D.</given-names></string-name> (<year>2003</year>). <article-title>On d-trees, beans, and b-accents</article-title>. <source>Linguistics and Philosophy</source>, <volume>26</volume>, <fpage>511</fpage>&#8211;<lpage>545</lpage>. DOI: <pub-id pub-id-type="doi">10.1023/A:1025887707652</pub-id></mixed-citation></ref>
<ref id="B9"><label>9</label><mixed-citation publication-type="journal"><string-name><surname>Chevallier</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Noveck</surname>, <given-names>I. A.</given-names></string-name>, <string-name><surname>Nazir</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Bott</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Lanzetti</surname>, <given-names>V.</given-names></string-name>, &amp; <string-name><surname>Sperber</surname>, <given-names>D.</given-names></string-name> (<year>2008</year>). <article-title>Making disjunctions exclusive</article-title>. <source>Quarterly Journal of Experimental Psychology</source>, <volume>61</volume>(<issue>11</issue>), <fpage>1741</fpage>&#8211;<lpage>1760</lpage>. DOI: <pub-id pub-id-type="doi">10.1080/17470210701712960</pub-id></mixed-citation></ref>
<ref id="B10"><label>10</label><mixed-citation publication-type="journal"><string-name><surname>Constant</surname>, <given-names>N.</given-names></string-name> (<year>2012</year>). <article-title>English rise-fall-rise: A study in the semantics and pragmatics of intonation</article-title>. <source>Linguistics and Philosophy</source>, <volume>35</volume>, <fpage>407</fpage>&#8211;<lpage>442</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s10988-012-9121-1</pub-id></mixed-citation></ref>
<ref id="B11"><label>11</label><mixed-citation publication-type="book"><string-name><surname>Cummins</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Katsos</surname>, <given-names>N.</given-names></string-name> (<year>2019</year>). <chapter-title>Introduction</chapter-title>. In <source>The Oxford handbook of experimental semantics and pragmatics</source>. <publisher-name>Oxford University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1093/oxfordhb/9780198791768.013.33</pub-id></mixed-citation></ref>
<ref id="B12"><label>12</label><mixed-citation publication-type="journal"><string-name><surname>Cummins</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Rohde</surname>, <given-names>H.</given-names></string-name> (<year>2015</year>). <article-title>Evoking context with contrastive stress: Effects on pragmatic enrichment</article-title>. <source>Frontiers in Psychology</source>, <volume>6</volume>, <elocation-id>1779</elocation-id>. DOI: <pub-id pub-id-type="doi">10.3389/fpsyg.2015.01779</pub-id></mixed-citation></ref>
<ref id="B13"><label>13</label><mixed-citation publication-type="book"><string-name><surname>de Marneffe</surname>, <given-names>M.-C.</given-names></string-name>, &amp; <string-name><surname>Tonhauser</surname>, <given-names>J.</given-names></string-name> (<year>2019</year>). <chapter-title>Inferring meaning from indirect answers to polar questions: The contribution of the rise-fall-rise contour</chapter-title>. In <string-name><given-names>E.</given-names> <surname>Onea</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Zimmermann</surname></string-name>, &amp; <string-name><given-names>K.</given-names> <surname>von Heusinger</surname></string-name> (Eds.), <source>Questions in discourse</source> (pp. <fpage>132</fpage>&#8211;<lpage>163</lpage>). <publisher-name>Brill</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1163/9789004378322_006</pub-id></mixed-citation></ref>
<ref id="B14"><label>14</label><mixed-citation publication-type="thesis"><string-name><surname>Degen</surname>, <given-names>J.</given-names></string-name> (<year>2013</year>). <source>Alternatives in pragmatic reasoning</source> [Doctoral dissertation, <publisher-name>University of Rochester</publisher-name>].</mixed-citation></ref>
<ref id="B15"><label>15</label><mixed-citation publication-type="journal"><string-name><surname>Degen</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Tanenhaus</surname>, <given-names>M. K.</given-names></string-name> (<year>2015</year>). <article-title>Processing scalar implicature: A constraint-based approach</article-title>. <source>Cognitive Science</source>, <volume>39</volume>(<issue>4</issue>), <fpage>667</fpage>&#8211;<lpage>710</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/cogs.12171</pub-id></mixed-citation></ref>
<ref id="B16"><label>16</label><mixed-citation publication-type="journal"><string-name><surname>Doran</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Ward</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Larson</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>McNabb</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Baker</surname>, <given-names>R. E.</given-names></string-name> (<year>2012</year>). <article-title>A novel experimental paradigm for distinguishing between what is said and what is implicated</article-title>. <source>Language</source>, <volume>88</volume>(<issue>1</issue>), <fpage>124</fpage>&#8211;<lpage>154</lpage>. DOI: <pub-id pub-id-type="doi">10.1353/lan.2012.0008</pub-id></mixed-citation></ref>
<ref id="B17"><label>17</label><mixed-citation publication-type="journal"><string-name><surname>Farkas</surname>, <given-names>D. F.</given-names></string-name>, &amp; <string-name><surname>Roelofsen</surname>, <given-names>F.</given-names></string-name> (<year>2017</year>). <article-title>Division of labor in the interpretation of declaratives and interrogatives</article-title>. <source>Journal of Semantics</source>, <volume>34</volume>, <fpage>237</fpage>&#8211;<lpage>289</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/jos/ffw012</pub-id></mixed-citation></ref>
<ref id="B18"><label>18</label><mixed-citation publication-type="journal"><string-name><surname>Fodor</surname>, <given-names>J. D.</given-names></string-name> (<year>2002</year>). <article-title>Prosodic disambiguation in silent reading</article-title>. <source>Proceedings of NELS</source>, <volume>32</volume>, <fpage>113</fpage>&#8211;<lpage>137</lpage>.</mixed-citation></ref>
<ref id="B19"><label>19</label><mixed-citation publication-type="book"><string-name><surname>Frazier</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Gibson</surname>, <given-names>E.</given-names></string-name> (<year>2015</year>). <source>Explicit and implicit prosody in sentence processing: Studies in honor of Janet Dean Fodor</source>. <publisher-name>Springer</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1007/978-3-319-12961-7</pub-id></mixed-citation></ref>
<ref id="B20"><label>20</label><mixed-citation publication-type="journal"><string-name><surname>G&#246;bel</surname>, <given-names>A.</given-names></string-name> (<year>2019</year>). <article-title>Additives pitching in: L*+H signals ordered focus alternatives</article-title>. <source>Proceedings of SALT 29</source>, <fpage>279</fpage>&#8211;<lpage>299</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/salt.v29i0.4612</pub-id></mixed-citation></ref>
<ref id="B21"><label>21</label><mixed-citation publication-type="journal"><string-name><surname>G&#246;bel</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Wagner</surname>, <given-names>M.</given-names></string-name> (<year>2023</year>). <article-title>On a concessive reading of the rise-fall-rise contour: Contextual and semantic factors</article-title>. <source>Proceedings of ELM</source> <volume>2</volume>, <fpage>83</fpage>&#8211;<lpage>94</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/elm.2.5395</pub-id></mixed-citation></ref>
<ref id="B22"><label>22</label><mixed-citation publication-type="journal"><string-name><surname>Goodhue</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Harrison</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Su</surname>, <given-names>Y. T. C.</given-names></string-name>, &amp; <string-name><surname>Wagner</surname>, <given-names>M.</given-names></string-name> (<year>2016</year>). <article-title>Toward a bestiary of English intonational contours</article-title>. In <string-name><given-names>B.</given-names> <surname>Prickett</surname></string-name> &amp; <string-name><given-names>C.</given-names> <surname>Hammerly</surname></string-name> (Eds.), <source>Proceedings of the North East Linguistics Society</source> <volume>46</volume> (pp. <fpage>311</fpage>&#8211;<lpage>320</lpage>).</mixed-citation></ref>
<ref id="B23"><label>23</label><mixed-citation publication-type="journal"><string-name><surname>Goodhue</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Wagner</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>Intonation, &#8216;yes&#8217; and &#8216;no&#8217;</article-title>. <source>Glossa: A Journal of General Linguistics</source>, <volume>3</volume>, <fpage>1</fpage>&#8211;<lpage>45</lpage>. DOI: <pub-id pub-id-type="doi">10.5334/gjgl.210</pub-id></mixed-citation></ref>
<ref id="B24"><label>24</label><mixed-citation publication-type="journal"><string-name><surname>Gotzner</surname>, <given-names>N.</given-names></string-name> (<year>2019</year>). <article-title>The role of focus intonation in implicature computation: A comparison with &#8216;only&#8217; and &#8216;also&#8217;</article-title>. <source>Natural Language Semantics</source>, <volume>27</volume>, <fpage>189</fpage>&#8211;<lpage>226</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s11050-019-09154-7</pub-id></mixed-citation></ref>
<ref id="B25"><label>25</label><mixed-citation publication-type="journal"><string-name><surname>Gotzner</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Solt</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Benz</surname>, <given-names>A.</given-names></string-name> (<year>2018</year>). <article-title>Scalar diversity, negative strengthening, and adjectival semantics</article-title>. <source>Frontiers in Psychology</source>, <volume>9</volume>, <elocation-id>1659</elocation-id>. DOI: <pub-id pub-id-type="doi">10.3389/fpsyg.2018.01659</pub-id></mixed-citation></ref>
<ref id="B26"><label>26</label><mixed-citation publication-type="book"><string-name><surname>Grice</surname>, <given-names>H. P.</given-names></string-name> (<year>1967</year>). <chapter-title>Logic and conversation</chapter-title>. In <string-name><given-names>P.</given-names> <surname>Grice</surname></string-name> (Ed.), <source>Studies in the way of words</source> (pp. <fpage>41</fpage>&#8211;<lpage>58</lpage>). <publisher-name>Harvard University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1163/9789004368811_003</pub-id></mixed-citation></ref>
<ref id="B27"><label>27</label><mixed-citation publication-type="journal"><string-name><surname>Gualmini</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Hulsey</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hacquard</surname>, <given-names>V.</given-names></string-name>, &amp; <string-name><surname>Fox</surname>, <given-names>D.</given-names></string-name> (<year>2008</year>). <article-title>The Question-Answer Requirement for scope assignment</article-title>. <source>Natural Language Semantics</source>, <volume>16</volume>(<issue>3</issue>), <fpage>205</fpage>&#8211;<lpage>237</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s11050-008-9029-z</pub-id></mixed-citation></ref>
<ref id="B28"><label>28</label><mixed-citation publication-type="thesis"><string-name><surname>Gunlogson</surname>, <given-names>C.</given-names></string-name> (<year>2001</year>). <source>True to form: Rising and falling declaratives as questions in English</source> [Doctoral dissertation, <publisher-name>University of California, Santa Cruz</publisher-name>].</mixed-citation></ref>
<ref id="B29"><label>29</label><mixed-citation publication-type="book"><string-name><surname>Heim</surname>, <given-names>I.</given-names></string-name> (<year>1991</year>). <chapter-title>Artikel und Definitheit</chapter-title>. In <string-name><given-names>A. v.</given-names> <surname>Stechow</surname></string-name> &amp; <string-name><given-names>D.</given-names> <surname>Wunderlich</surname></string-name> (Eds.), <source>Handbuch der Semantik</source> (pp. <fpage>487</fpage>&#8211;<lpage>535</lpage>). <publisher-name>de Gruyter</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1515/9783110126969.7.487</pub-id></mixed-citation></ref>
<ref id="B30"><label>30</label><mixed-citation publication-type="thesis"><string-name><surname>Hirschberg</surname>, <given-names>J. B.</given-names></string-name> (<year>1985</year>). <source>A theory of scalar implicature</source> [Doctoral dissertation, <publisher-name>University of Pennsylvania</publisher-name>].</mixed-citation></ref>
<ref id="B31"><label>31</label><mixed-citation publication-type="book"><string-name><surname>H&#246;hle</surname>, <given-names>T. N.</given-names></string-name> (<year>1992</year>). <chapter-title>&#220;ber Verum-Fokus im Deutschen</chapter-title>. In <string-name><given-names>J.</given-names> <surname>Jacobs</surname></string-name> (Ed.), <source>Informationsstruktur und Grammatik</source> (pp. <fpage>112</fpage>&#8211;<lpage>141</lpage>). <publisher-name>Westdeutscher Verlag</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1007/978-3-663-12176-3_5</pub-id></mixed-citation></ref>
<ref id="B32"><label>32</label><mixed-citation publication-type="thesis"><string-name><surname>Horn</surname>, <given-names>L. R.</given-names></string-name> (<year>1972</year>). <source>On the semantic properties of logical operators in English</source> [Doctoral dissertation, <publisher-name>UCLA</publisher-name>].</mixed-citation></ref>
<ref id="B33"><label>33</label><mixed-citation publication-type="journal"><string-name><surname>Hu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Levy</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Degen</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Schuster</surname>, <given-names>S.</given-names></string-name> (<year>2023</year>). <article-title>Expectations over unspoken alternatives predict pragmatic inferences</article-title>. <source>Transactions of the Association for Computational Linguistics</source>, <volume>11</volume>, <fpage>885</fpage>&#8211;<lpage>901</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/tacl_a_00579</pub-id></mixed-citation></ref>
<ref id="B34"><label>34</label><mixed-citation publication-type="journal"><string-name><surname>Hu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Levy</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Schuster</surname>, <given-names>S.</given-names></string-name> (<year>2022</year>). <article-title>Predicting scalar diversity with context-driven uncertainty over alternatives</article-title>. <source>Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics</source>, <fpage>68</fpage>&#8211;<lpage>74</lpage>. DOI: <pub-id pub-id-type="doi">10.18653/v1/2022.cmcl-1.8</pub-id></mixed-citation></ref>
<ref id="B35"><label>35</label><mixed-citation publication-type="book"><string-name><surname>Hulsey</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hacquard</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Fox</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Gualmini</surname>, <given-names>A.</given-names></string-name> (<year>2004</year>). <chapter-title>The Question-Answer Requirement and scope assignment</chapter-title>. In <string-name><given-names>A.</given-names> <surname>Csirmaz</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Gualmini</surname></string-name>, &amp; <string-name><given-names>A.</given-names> <surname>Nevins</surname></string-name> (Eds.), <source>MIT working papers in linguistics</source> (pp. <fpage>71</fpage>&#8211;<lpage>90</lpage>). <publisher-name>MITWPL</publisher-name>.</mixed-citation></ref>
<ref id="B36"><label>36</label><mixed-citation publication-type="journal"><string-name><surname>Krifka</surname>, <given-names>M.</given-names></string-name> (<year>2008</year>). <article-title>Basic notions of information structure</article-title>. <source>Acta Linguistica Hungarica</source>, <volume>55</volume>, <fpage>243</fpage>&#8211;<lpage>276</lpage>. DOI: <pub-id pub-id-type="doi">10.1556/ALing.55.2008.3-4.2</pub-id></mixed-citation></ref>
<ref id="B37"><label>37</label><mixed-citation publication-type="book"><string-name><surname>Kursat</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Degen</surname>, <given-names>J.</given-names></string-name> (<year>2020</year>). <chapter-title>Probability and processing speed of scalar inferences is context-dependent</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Denison</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Mack</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Xu</surname></string-name>, &amp; <string-name><given-names>B. C.</given-names> <surname>Armstrong</surname></string-name> (Eds.), <source>Proceedings of the 42nd Annual Conference of the Cognitive Science Society</source> (pp. <fpage>1236</fpage>&#8211;<lpage>1242</lpage>). <publisher-name>Cognitive Science Society</publisher-name>.</mixed-citation></ref>
<ref id="B38"><label>38</label><mixed-citation publication-type="book"><string-name><surname>Ladd</surname>, <given-names>D. R.</given-names></string-name> (<year>2008</year>). <source>Intonational phonology</source>. <publisher-name>Cambridge University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1017/CBO9780511808814</pub-id></mixed-citation></ref>
<ref id="B39"><label>39</label><mixed-citation publication-type="journal"><string-name><surname>Liberman</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Sag</surname>, <given-names>I.</given-names></string-name> (<year>1974</year>). <article-title>Prosodic form and discourse function</article-title>. <source>Proceedings of CLS</source>, <volume>10</volume>, <fpage>416</fpage>&#8211;<lpage>427</lpage>.</mixed-citation></ref>
<ref id="B40"><label>40</label><mixed-citation publication-type="book"><string-name><surname>Lohnstein</surname>, <given-names>H.</given-names></string-name> (<year>2015</year>). <chapter-title>Verum focus</chapter-title>. In <string-name><given-names>C.</given-names> <surname>F&#233;ry</surname></string-name> &amp; <string-name><given-names>S.</given-names> <surname>Ishihara</surname></string-name> (Eds.), <source>The Oxford handbook of information structure</source> (pp. <fpage>290</fpage>&#8211;<lpage>313</lpage>). <publisher-name>Oxford University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1093/oxfordhb/9780199642670.013.33</pub-id></mixed-citation></ref>
<ref id="B41"><label>41</label><mixed-citation publication-type="journal"><string-name><surname>Pankratz</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>van Tiel</surname>, <given-names>B.</given-names></string-name> (<year>2021</year>). <article-title>The role of relevance for scalar diversity: A usage-based approach</article-title>. <source>Language and Cognition</source>, <volume>13</volume>(<issue>4</issue>), <fpage>562</fpage>&#8211;<lpage>594</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/langcog.2021.13</pub-id></mixed-citation></ref>
<ref id="B42"><label>42</label><mixed-citation publication-type="thesis"><string-name><surname>Pierrehumbert</surname>, <given-names>J. B.</given-names></string-name> (<year>1980</year>). <source>The phonology and phonetics of English intonation</source> [Doctoral dissertation, <publisher-name>MIT</publisher-name>].</mixed-citation></ref>
<ref id="B43"><label>43</label><mixed-citation publication-type="journal"><string-name><surname>Roberts</surname>, <given-names>C.</given-names></string-name> (<year>2012</year>). <article-title>Information structure in discourse: Towards an integrated formal theory of pragmatics [Earlier version appeared in OSU Working Papers in Linguistics 49 in 1996]</article-title>. <source>Semantics and Pragmatics</source>, <volume>5</volume>, <fpage>1</fpage>&#8211;<lpage>69</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/sp.5.6</pub-id></mixed-citation></ref>
<ref id="B44"><label>44</label><mixed-citation publication-type="journal"><string-name><surname>Ronai</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Xiang</surname>, <given-names>M.</given-names></string-name> (<year>2021a</year>). <article-title>Exploring the connection between Question Under Discussion and scalar diversity</article-title>. <source>Proceedings of the Linguistic Society of America</source>, <volume>6</volume>(<issue>1</issue>), <fpage>649</fpage>&#8211;<lpage>662</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/plsa.v6i1.5001</pub-id></mixed-citation></ref>
<ref id="B45"><label>45</label><mixed-citation publication-type="journal"><string-name><surname>Ronai</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Xiang</surname>, <given-names>M.</given-names></string-name> (<year>2021b</year>). <article-title>Pragmatic inferences are QUD-sensitive: An experimental study</article-title>. <source>Journal of Linguistics</source>, <volume>57</volume>(<issue>4</issue>), <fpage>841</fpage>&#8211;<lpage>870</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0022226720000389</pub-id></mixed-citation></ref>
<ref id="B46"><label>46</label><mixed-citation publication-type="journal"><string-name><surname>Ronai</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Xiang</surname>, <given-names>M.</given-names></string-name> (<year>2022</year>). <article-title>Three factors in explaining scalar diversity</article-title>. <source>Proceedings of Sinn und Bedeutung</source>, <volume>26</volume>, <fpage>716</fpage>&#8211;<lpage>733</lpage>.</mixed-citation></ref>
<ref id="B47"><label>47</label><mixed-citation publication-type="journal"><string-name><surname>Ronai</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Xiang</surname>, <given-names>M.</given-names></string-name> (<year>2024</year>). <article-title>What could have been said? Alternatives and variability in pragmatic inferences</article-title>. <source>Journal of Memory and Language</source>, <volume>136</volume>, <elocation-id>104507</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2024.104507</pub-id></mixed-citation></ref>
<ref id="B48"><label>48</label><mixed-citation publication-type="journal"><string-name><surname>Rooth</surname>, <given-names>M.</given-names></string-name> (<year>1992</year>). <article-title>A theory of focus interpretation</article-title>. <source>Natural Language Semantics</source>, <volume>1</volume>, <fpage>75</fpage>&#8211;<lpage>116</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/BF02342617</pub-id></mixed-citation></ref>
<ref id="B49"><label>49</label><mixed-citation publication-type="webpage"><string-name><surname>Sandberg</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Cole</surname>, <given-names>J.</given-names></string-name> (<year>2022</year>). <source>The role of duration in signaling scalar alternative sets</source> [Poster presented at the 9th Experimental Pragmatics Conference]. <uri>https://sites.northwestern.edu/katesandberg/files/2022/09/XPrag-Pres_final.pptx</uri></mixed-citation></ref>
<ref id="B50"><label>50</label><mixed-citation publication-type="journal"><string-name><surname>Schwarz</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Clifton</surname>, <given-names>C.</given-names>, <suffix>Jr.</suffix></string-name>, &amp; <string-name><surname>Frazier</surname>, <given-names>L.</given-names></string-name> (<year>2007</year>). <article-title>Strengthening &#8216;or&#8217;: Effects of focus and downward entailing contexts on scalar implicatures</article-title>. <source>University of Massachusetts Occasional Papers in Linguistics</source>, <volume>33</volume>(<issue>1</issue>), <fpage>9</fpage>.</mixed-citation></ref>
<ref id="B51"><label>51</label><mixed-citation publication-type="journal"><string-name><surname>Simons</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Warren</surname>, <given-names>T.</given-names></string-name> (<year>2018</year>). <article-title>A closer look at strengthened readings of scalars</article-title>. <source>Quarterly Journal of Experimental Psychology</source>, <volume>71</volume>(<issue>1</issue>), <fpage>272</fpage>&#8211;<lpage>279</lpage>. DOI: <pub-id pub-id-type="doi">10.1080/17470218.2017.1314516</pub-id></mixed-citation></ref>
<ref id="B52"><label>52</label><mixed-citation publication-type="journal"><string-name><surname>Solt</surname>, <given-names>S.</given-names></string-name> (<year>2015</year>). <article-title>Measurement scales in natural language</article-title>. <source>Language and Linguistics Compass</source>, <volume>9</volume>, <fpage>14</fpage>&#8211;<lpage>32</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/lnc3.12101</pub-id></mixed-citation></ref>
<ref id="B53"><label>53</label><mixed-citation publication-type="journal"><string-name><surname>Sun</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Tian</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Breheny</surname>, <given-names>R.</given-names></string-name> (<year>2018</year>). <article-title>A link between local enrichment and scalar diversity</article-title>. <source>Frontiers in Psychology</source>, <volume>9</volume>. DOI: <pub-id pub-id-type="doi">10.3389/fpsyg.2018.02092</pub-id></mixed-citation></ref>
<ref id="B54"><label>54</label><mixed-citation publication-type="journal"><string-name><surname>Sun</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Tian</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Breheny</surname>, <given-names>R.</given-names></string-name> (<year>2023</year>). <article-title>A corpus-based examination of scalar diversity</article-title>. <source>Journal of Experimental Psychology: Learning, Memory, and Cognition</source>. DOI: <pub-id pub-id-type="doi">10.1037/xlm0001278</pub-id></mixed-citation></ref>
<ref id="B55"><label>55</label><mixed-citation publication-type="journal"><string-name><surname>Tomlinson</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Gotzner</surname>, <given-names>N.</given-names></string-name>, &amp; <string-name><surname>Bott</surname>, <given-names>L.</given-names></string-name> (<year>2017</year>). <article-title>Intonation and pragmatic enrichment: How intonation constrains ad hoc scalar inferences</article-title>. <source>Language and Speech</source>, <volume>60</volume>(<issue>2</issue>), <fpage>200</fpage>&#8211;<lpage>223</lpage>. DOI: <pub-id pub-id-type="doi">10.1177/0023830917716101</pub-id></mixed-citation></ref>
<ref id="B56"><label>56</label><mixed-citation publication-type="journal"><string-name><surname>Tomlinson</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Ronderos</surname>, <given-names>C. R.</given-names></string-name> (<year>2021</year>). <article-title>Does intonation automatically strengthen scalar implicatures?</article-title> <source>Semantics and Pragmatics</source>, <volume>14</volume>(<issue>4</issue>), <fpage>1</fpage>&#8211;<lpage>30</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/sp.14.4</pub-id></mixed-citation></ref>
<ref id="B57"><label>57</label><mixed-citation publication-type="journal"><string-name><surname>van Tiel</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>van Miltenburg</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Zevakhina</surname>, <given-names>N.</given-names></string-name>, &amp; <string-name><surname>Geurts</surname>, <given-names>B.</given-names></string-name> (<year>2016</year>). <article-title>Scalar diversity</article-title>. <source>Journal of Semantics</source>, <volume>33</volume>(<issue>1</issue>), <fpage>137</fpage>&#8211;<lpage>175</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/jos/ffu017</pub-id></mixed-citation></ref>
<ref id="B58"><label>58</label><mixed-citation publication-type="journal"><string-name><surname>Wagner</surname>, <given-names>M.</given-names></string-name> (<year>2012</year>). <article-title>Contrastive topics decomposed</article-title>. <source>Semantics and Pragmatics</source>, <volume>5</volume>, <fpage>1</fpage>&#8211;<lpage>54</lpage>. DOI: <pub-id pub-id-type="doi">10.3765/sp.5.8</pub-id></mixed-citation></ref>
<ref id="B59"><label>59</label><mixed-citation publication-type="book"><string-name><surname>Wagner</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>McClay</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Mak</surname>, <given-names>L.</given-names></string-name> (<year>2013</year>). <chapter-title>Incomplete answers and the rise-fall-rise contour</chapter-title>. In <string-name><given-names>R.</given-names> <surname>Fern&#225;ndez</surname></string-name> &amp; <string-name><given-names>A.</given-names> <surname>Isard</surname></string-name> (Eds.), <source>Proceedings of the 17th workshop on the semantics and pragmatics of dialogue</source> (pp. <fpage>140</fpage>&#8211;<lpage>149</lpage>). <publisher-name>SEMDIAL</publisher-name>.</mixed-citation></ref>
<ref id="B60"><label>60</label><mixed-citation publication-type="journal"><string-name><surname>Ward</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name><surname>Hirschberg</surname>, <given-names>J.</given-names></string-name> (<year>1985</year>). <article-title>Implicating uncertainty: The pragmatics of fall-rise intonation</article-title>. <source>Language</source>, <volume>61</volume>, <fpage>747</fpage>&#8211;<lpage>776</lpage>. DOI: <pub-id pub-id-type="doi">10.2307/414489</pub-id></mixed-citation></ref>
<ref id="B61"><label>61</label><mixed-citation publication-type="thesis"><string-name><surname>Westera</surname>, <given-names>M.</given-names></string-name> (<year>2017</year>). <source>Exhaustivity and intonation: A unified theory</source> [Doctoral dissertation, <publisher-name>University of Amsterdam</publisher-name>].</mixed-citation></ref>
<ref id="B62"><label>62</label><mixed-citation publication-type="book"><string-name><surname>Westera</surname>, <given-names>M.</given-names></string-name> (<year>2019</year>). <chapter-title>Rise-fall-rise as a marker of secondary QUDs</chapter-title>. In <string-name><given-names>D.</given-names> <surname>Gutzmann</surname></string-name> &amp; <string-name><given-names>K.</given-names> <surname>Turgay</surname></string-name> (Eds.), <source>Secondary content: The linguistics of side issues</source> (pp. <fpage>376</fpage>&#8211;<lpage>404</lpage>). <publisher-name>Brill</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1163/9789004393127_015</pub-id></mixed-citation></ref>
<ref id="B63"><label>63</label><mixed-citation publication-type="journal"><string-name><surname>Westera</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Boleda</surname>, <given-names>G.</given-names></string-name> (<year>2020</year>). <article-title>A closer look at scalar diversity using contextualized semantic similarity</article-title>. <source>Proceedings of Sinn und Bedeutung</source>, <volume>24</volume>(<issue>2</issue>), <fpage>439</fpage>&#8211;<lpage>454</lpage>. DOI: <pub-id pub-id-type="doi">10.18148/sub/2020.v24i2.908</pub-id></mixed-citation></ref>
<ref id="B64"><label>64</label><mixed-citation publication-type="journal"><string-name><surname>Yang</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Minai</surname>, <given-names>U.</given-names></string-name>, &amp; <string-name><surname>Fiorentino</surname>, <given-names>R.</given-names></string-name> (<year>2018</year>). <article-title>Context-sensitivity and individual differences in the derivation of scalar implicature</article-title>. <source>Frontiers in Psychology</source>, <volume>9</volume>, <elocation-id>1720</elocation-id>. DOI: <pub-id pub-id-type="doi">10.3389/fpsyg.2018.01720</pub-id></mixed-citation></ref>
<ref id="B65"><label>65</label><mixed-citation publication-type="thesis"><string-name><surname>Zondervan</surname>, <given-names>A.</given-names></string-name> (<year>2010</year>). <source>Scalar implicatures or focus: An experimental approach</source> [Doctoral dissertation, <publisher-name>University of Amsterdam</publisher-name>].</mixed-citation></ref>
<ref id="B66"><label>66</label><mixed-citation publication-type="book"><string-name><surname>Zondervan</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Meroni</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Gualmini</surname>, <given-names>A.</given-names></string-name> (<year>2008</year>). <chapter-title>Experiments on the role of the question under discussion for ambiguity resolution and implicature computation in adults</chapter-title>. In <string-name><given-names>T.</given-names> <surname>Friedman</surname></string-name> &amp; <string-name><given-names>S.</given-names> <surname>Ito</surname></string-name> (Eds.), <source>Proceedings of SALT</source> <volume>18</volume> (pp. <fpage>765</fpage>&#8211;<lpage>777</lpage>). <publisher-name>Cornell University</publisher-name>. DOI: <pub-id pub-id-type="doi">10.3765/salt.v18i0.2486</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>