Joshua Holla, AI Researcher.
https://joshholla.github.io/
Off-Policy++
\[\newcommand{\e}{\epsilon}
\newcommand{\y}{\gamma}
\newcommand{\al}{\alpha}
\newcommand{\s}{\sigma}
\newcommand{\ta}{\theta}
\newcommand{\w}{\omega}
\newcommand{\g}{\nabla}
\newcommand{\E}{\mathbb{E}}
\newcommand{\N}{\mathcal{N}}
\newcommand{\lb}{\left [}
\newcommand{\rb}{\right ]}
\newcommand{\lp}{\left (}
\newcommand{\rp}{\right )}
\newcommand{\lv}{\left \Vert}
\newcommand{\rv}{\right \Vert}
\newcommand{\la}{\left |}
\newcommand{\ra}{\right |}
\newcommand{\B}{\mathcal{B}}
\newcommand{\Loss}{\mathcal{L}}\]
<p>This is a short primer on Off-Dynamics Reinforcement Learning. I also talk a little bit about density ratio estimation.</p>
<h3 id="what-is-this-off-dynamics-reinforcement-learning-you-speak-of">What is this <code class="language-plaintext highlighter-rouge">Off-Dynamics Reinforcement Learning</code> you speak of?</h3>
<p>The Off-Policy<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote">1</a></sup> setting is cool, but let us consider operating in an environment where it is really hard to collect data: suppose data collection is really expensive, or takes too long. It becomes rather hard to solve this setting using traditional Off-Policy methods.
In such settings, your friendly neighborhood engineer is going to suggest training inexpensively on a simulator.
That’s a great idea! And like all great ideas, it has a few drawbacks.
For one, I doubt we’re ever going to have perfect simulators. The dream of running behaviour policies in simulation to obtain good enough target policies is likely to remain just that.<br />
Unless we solve this problem with a little something called Off-Dynamics Reinforcement Learning.</p>
<p>Let us assume that we will never be able to perfectly model the real world dynamics for tasks that we care about.
This forces us to assume two Markov Decision Processes (MDPs)<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote">2</a></sup> in the treatment of this setting:
\(\mathcal{M}_{\text{source}}\) represents the source domain, which is the simulator, practice domain or the approximate model of the target domain. \(\mathcal{M}_{\text{target}}\) is the target domain.<br />
We assume that both these MDPs have the same state space \(\mathcal{S}\), action space \(\mathcal{A}\), reward function \(r\) and initial state distribution \(p_1(s_1)\).
The only difference between the domains is their dynamics. The dynamics are represented by \(p_{\text{target}} (s_{t+1} \vert s_t, a_t)\) and \(p_{\text{source}} (s_{t+1} \vert s_t, a_t)\).
We also make the assumption of coverage:</p>
\[\begin{align*}
p_{\text{target}} (s_{t+1} \vert s_t, a_t) > 0 \implies p_{\text{source}} (s_{t+1} \vert s_t, a_t) > 0 \quad \forall \, s_t, s_{t+1} \in \mathcal{S}, \; a_t \in \mathcal{A}
\end{align*}\]
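<p>To make this concrete, here is a toy pair of domains (my own hypothetical example, not one from any paper): shared state space, action space and reward, with a single dynamics parameter that differs between them.</p>

```python
import numpy as np

class PointMassEnv:
    """Toy 1-D point mass. `drag` is the dynamics parameter that
    differs between the source and target domains; everything else
    (states, actions, reward, initial state) is shared."""
    def __init__(self, drag, seed=0):
        self.drag = drag
        self.rng = np.random.default_rng(seed)
        self.state = 0.0

    def reset(self):
        self.state = 0.0
        return self.state

    def step(self, action):
        # Only the transition function differs between domains.
        noise = self.rng.normal(scale=0.01)
        self.state += (1.0 - self.drag) * action + noise
        reward = -abs(self.state - 1.0)   # shared reward: reach x = 1
        return self.state, reward

source_env = PointMassEnv(drag=0.0)   # idealized "simulator"
target_env = PointMassEnv(drag=0.3)   # "real world" with unmodeled drag
```

With the same actions, trajectories drift apart purely because of the dynamics gap, which is exactly the situation the rest of this post deals with.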
<p>The core objective is that we’d like to learn Markovian policy \(\pi_{\ta} (a \vert s)\) that maximises the expected discounted sum of rewards on \(\mathcal{M}_{\text{target}}\):</p>
\[\begin{align*}
\E_{\pi_{\ta}, \mathcal{M}_{\text{target}}} \lb \sum_t \y^t r(s_t, a_t) \rb
\end{align*}\]
<p>We’d like to achieve this objective using mostly inexpensive interactions in the source MDP (\(\mathcal{M}_{\text{source}}\))
and a small number of interactions in the target MDP \(\mathcal{M}_{\text{target}}\).<br />
Our final policy should be near optimal in the target MDP, \(\mathcal{M}_{\text{target}}\). We don’t really care about optimality in the source MDP(s).</p>
<p>A recent paper (Eysenbach, et al. 2020)<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote">3</a></sup> did an excellent job of attacking learning in this setting. They called it Off-Dynamcics Reinforcement Learning, and I like that name.
It’s a really cool paper, and I highly recommend reading it.<br />
This post is going to walk through a few bits from this paper that I found interesting, and I’ll try to share some of the gotchas that leapt out at me after a lot of staring at a whiteboard and more than a little muttering.</p>
<hr />
<h3 id="achieving-our-objective">Achieving our objective:</h3>
<p>Consider the probabilistic inference interpretation of RL (Levine 2018)<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote">4</a></sup>. Here, we view the reward function as a desired distribution over trajectories.
The agent samples from this distribution of trajectories by picking trajectories with probability proportional to their exponentiated reward.</p>
<p>For our treatment, let us define \(p(\tau)\) to be the desired distribution over trajectories in the target domain:</p>
\[\begin{align*}
p(\tau) \propto p_1(s_1) \lp \prod_t p_{\text{target}}(s_{t+1} \vert s_t, a_t) \rp \exp \lp \sum_t r(s_t, a_t) \rp
\end{align*}\]
<p>We’d like our policy \(\pi_\ta\) to pick the best trajectories (i.e., maximize the expected reward) under this distribution.</p>
<p>Now consider our agent’s distribution over trajectories in the source domain. Let’s call it \(q(\tau)\):</p>
\[\begin{align*}
q(\tau) = p_1 (s_1) \prod_t p_{\text{source}} (s_{t+1} \vert s_t, a_t ) \pi_\theta (a_t \vert s_t )
\end{align*}\]
<p>\(q(\tau)\) is parameterized by \(\ta\).</p>
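<p>For a small tabular MDP, \(\log q(\tau)\) can be computed directly from these factors. A minimal illustrative sketch (the function and argument names are mine):</p>

```python
import numpy as np

def log_traj_prob(p1, P, pi, states, actions):
    """log q(tau) for a trajectory in a tabular MDP: the initial-state
    log-prob plus per-step policy and transition log-probs.
    P[s, a, s'] is the dynamics tensor, pi[s, a] is the policy."""
    logp = np.log(p1[states[0]])
    for t in range(len(actions)):
        logp += np.log(pi[states[t], actions[t]])
        logp += np.log(P[states[t], actions[t], states[t + 1]])
    return logp
```

Swapping the dynamics tensor `P` between source and target is all it takes to move between the two trajectory distributions.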
<p>Minimizing the reverse KL divergence<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote">5</a></sup> between these two distributions will lead to achieving the objective we set up.</p>
\[\begin{align*}
D_{\text{KL}} (q(\tau) \vert \vert p(\tau) ) = &- \underset{\tau \sim q(\tau)}{\E} \lb \log p(\tau) - \log q(\tau) \rb \\
= &- \underset{\tau \sim q(\tau)}{\E} \bigg [ \log p_1(s_1) + \sum_t \log p_{\text{target}} (s_{t+1} \vert s_t, a_t) + \sum_t r(s_t, a_t) \\
&- \log p_1(s_1) - \sum_t \log p_{\text{source}} (s_{t+1} \vert s_t, a_t) - \sum_t \log \pi_\ta (a_t \vert s_t) \bigg ] \\
= &- \underset{\tau \sim q(\tau)}{\E}\bigg [ \sum_t \lp r(s_t, a_t) - \log \pi_\ta (a_t \vert s_t) + \log p_{\text{target}} (s_{t+1} \vert s_t, a_t) - \log p_{\text{source}} (s_{t+1} \vert s_t, a_t) \rp \bigg ]\\
= &- \underset{q}{\E} \lb \sum_t \lp r(s_t, a_t) + \mathcal{H}_{\pi}[a_t \vert s_t] + \Delta r (s_{t+1} , s_t, a_t ) \rp \rb
\end{align*}\]
\[\begin{align*}
\underset{\pi_{\ta}}{\min} \; D_{\text{KL}} (q \vert \vert p) = - \E_q \lb \sum_t \lp r(s_t, a_t) + \mathcal{H}_{\pi}[a_t \vert s_t] + \Delta r (s_{t+1} , s_t, a_t ) \rp \rb + c
\end{align*}\]
<p>Where \(\mathcal{H}_\pi[a_t \vert s_t] = - \log \pi_{\ta} (a_t \vert s_t)\) and</p>
\[\begin{align*}
\Delta r (s_{t+1} , s_t, a_t ) = \log p_{\text{target}} (s_{t+1} \vert s_t, a_t) - \log p_{\text{source}} (s_{t+1} \vert s_t, a_t)
\end{align*}\]
<p>If there are no differences in dynamics, \(\Delta r=0\).</p>
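<p>Before worrying about how to compute \(\Delta r\), note how neatly it slots into any off-policy learner: just relabel the rewards of source-domain transitions before each update. A sketch of that idea (the helper names are mine, and <code class="language-plaintext highlighter-rouge">delta_r_fn</code> is a placeholder for an estimator):</p>

```python
import numpy as np

def relabel_rewards(batch, delta_r_fn):
    """Shift each source-domain reward by the estimated dynamics gap,
    so the agent is penalized for exploiting transitions that are
    unlikely in the target domain. `batch` holds (s, a, r, s') arrays
    from the source replay buffer."""
    s, a, r, s_next = batch
    return s, a, r + delta_r_fn(s, a, s_next), s_next

# If the dynamics are identical, delta_r is zero everywhere and the
# rewards come back unchanged.
batch = (np.zeros(4), np.zeros(4), np.ones(4), np.zeros(4))
_, _, r_new, _ = relabel_rewards(batch, lambda s, a, sn: np.zeros_like(s))
```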
<p>Looks good eh?<br />
No. The whole reason we can’t build perfect simulators is that transition functions are really, really hard to learn.<br />
This \(\Delta r\) term needs some working on…</p>
<hr />
<h3 id="dealing-with-the-delta-r-term">Dealing with the \(\Delta r\) term</h3>
<p>If we cannot calculate a term exactly, the next best thing is a good estimate. Let’s try doing that with our \(\Delta r\) term. On expanding it, we get a ratio of intractable terms: \(\frac{p_{\text{target}} (s_{t+1} \vert s_t, a_t) }{p_{\text{source}} (s_{t+1} \vert s_t, a_t) }\).</p>
<p>Hang on. We can view \(\frac{p_{\text{target}} (s_{t+1} \vert s_t, a_t) }{p_{\text{source}} (s_{t+1} \vert s_t, a_t) }\) as a density ratio<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote">6</a></sup>.
And there are some cool ways to estimate density ratios using classifiers<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote">7</a></sup>. Fortunately we’ve gotten REALLY good at classifying things in the last few years.</p>
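<p>To see the density ratio trick on a toy problem (a sketch of the general trick, not this paper’s setup): train a classifier to tell samples from two distributions apart; with balanced classes, its log-odds estimate the log density ratio.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 1-D Gaussians standing in for "target" (label 1) and "source" (label 0).
x = np.concatenate([rng.normal(0.5, 1.0, 4000),    # "target" samples
                    rng.normal(0.0, 1.0, 4000)])   # "source" samples
y = np.concatenate([np.ones(4000), np.zeros(4000)])

# Tiny logistic regression fit by gradient descent on the log-loss.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    grad = p - y                       # d(log-loss)/d(logit)
    w -= 0.1 * np.mean(grad * x)
    b -= 0.1 * np.mean(grad)

# With balanced classes the classifier's logit estimates
# log p_target(x) - log p_source(x); here the true value is 0.5*x - 0.125.
log_ratio_at_1 = w * 1.0 + b           # true value at x = 1 is 0.375
```

No density was ever modeled explicitly; the classifier recovers the ratio directly, which is the whole appeal.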
<p>In this case, we can use two classifiers looking at data from our
replay buffers. One classifier looks at tuples of the form \(\langle s_t, a_t, s_{t+1} \rangle\) and the other looks at tuples of the form \(\langle s_t, a_t \rangle\).</p>
<p>Now the ratio that we’d like to estimate is given by:</p>
\[\begin{align*}
\Delta r &= \log \lb \frac{p_{\text{target}} (s_{t+1} \vert s_t, a_t)}{p_{\text{source}} (s_{t+1} \vert s_t, a_t)} \rb
\end{align*}\]
<p>Let us pause to define our terms a little more clearly:</p>
<p>\(p_{\text{target}}(s_{t+1} \vert s_t, a_t)\):
this is the environment transition probability in the ‘target’ environment.
It can also be written as \(p(s_{t+1}\vert s_t, a_t, \text{target})\). This quantity tells us that if we consider \(\mathcal{M}_{\text{target}}\) and our agent takes action \(a_t\) while in state \(s_t\), with probability \(p_{\text{target}}(s_{t+1} \vert s_t, a_t)\) our agent will change its state to \(s_{t+1}\).</p>
<p>\(p(\text{target} \vert s_t, a_t, s_{t+1})\): this is the probability output by a binary classifier that has been fed the tuple \(\langle s_t, a_t, s_{t+1} \rangle\)<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote">8</a></sup></p>
<p>Okay, let us focus a little more on the conditional probability arising from our binary classifier that takes the tuple \(\langle s_t, a_t, s_{t+1} \rangle\).<br />
Bayes rule tells us that:</p>
\[\begin{align*}
p(\text{target} \vert s_t, a_t, s_{t+1} ) = \frac{p(s_t, a_t, s_{t+1} \vert \text{target} ) \times p(\text{target}) }{p(s_t, a_t, s_{t+1})}
\end{align*}\]
<p>Now, let us take a closer look at the \(p(s_t, a_t, s_{t+1} \vert \text{target})\) term in the RL setting:<br />
Given a state \(s_t\) and action \(a_t\) the probability of getting \(s_{t+1}\) depends on the environment transition probability. Since the label tells us that we are in the target MDP, we consider \(p_{\text{target}}(s_{t+1} \vert s_t, a_t)\)</p>
\[\begin{align*}
p(s_t, a_t, s_{t+1} \vert \text{target}) = p(s_{t+1} \vert s_t, a_t, \text{target}) \times p(s_t, a_t \vert \text{target})
\end{align*}\]
<p>Where \(p(s_t, a_t \vert \text{target})\) is the probability of the state-action pair \((s_t, a_t)\) occurring in the target MDP.<br />
Once we make this substitution, we get:</p>
\[\begin{align*}
p(\text{target} \vert s_t , a_t, s_{t+1}) &= \frac{p(s_{t+1} \vert s_t, a_t, \text{target}) p(s_t, a_t \vert \text{target}) p(\text{target} )}{p(s_t,a_t,s_{t+1})} \\
&= \frac{p_{\text{target}} (s_{t+1} \vert s_t, a_t) p(s_t, a_t \vert \text{target}) p(\text{target} )}{p(s_t,a_t,s_{t+1})}
\end{align*}\]
<p>This implies that we can make the substitution:</p>
\[\begin{align*}
p_{\text{target}} (s_{t+1} \vert s_t, a_t) = \frac{p(\text{target} \vert s_t , a_t, s_{t+1}) p(s_t,a_t,s_{t+1})}{ p(s_t, a_t \vert \text{target}) p(\text{target} )}
\end{align*}\]
<p>Now when it comes to the \(p(s_t, a_t \vert \text{target})\) term, remember that we have a second classifier that takes in the tuple \(\langle s_t, a_t \rangle\). The output of that classifier is given by \(p(\text{target} \vert s_t, a_t)\).
Applying Bayes rule here tells us that:</p>
\[\begin{align*}
p(s_t,a_t \vert \text{target}) = \frac{p (\text{target} \vert s_t, a_t) p(s_t, a_t)}{p(\text{target})}
\end{align*}\]
<p>We’re now ready to deal with the \(\Delta r\) term that we wanted to estimate:</p>
\[\begin{align*}
\Delta r &= \log \lb \frac{p_{\text{target}} (s_{t+1} \vert s_t, a_t)}{p_{\text{source}} (s_{t+1} \vert s_t, a_t)} \rb \\
&= \log \lb \frac{p(\text{target}\vert s_t, a_t, s_{t+1}) p(s_t,a_t,s_{t+1}) }{ p(s_t, a_t \vert \text{target}) p(\text{target} ) } \times \frac{ p(s_t, a_t \vert \text{source}) p(\text{source} ) }{p(\text{source}\vert s_t, a_t, s_{t+1}) p(s_t,a_t,s_{t+1}) } \rb \\
&= \log \lb \frac{p(\text{target}\vert s_t, a_t, s_{t+1}) }{ p(s_t, a_t \vert \text{target}) p(\text{target} ) } \times \frac{ p(s_t, a_t \vert \text{source}) p(\text{source} ) }{p(\text{source}\vert s_t, a_t, s_{t+1}) } \rb \\
&= \log \lb \frac{p(\text{target}\vert s_t, a_t, s_{t+1}) p(\text{target} ) }{ p(\text{target} \vert s_t, a_t ) p(s_t, a_t) p(\text{target} ) } \times \frac{ p(\text{source} \vert s_t, a_t) p(s_t, a_t) p(\text{source} ) }{p(\text{source}\vert s_t, a_t, s_{t+1}) p(\text{source} ) } \rb\\
&= \log \lb \frac{p(\text{target}\vert s_t, a_t, s_{t+1}) }{ p(\text{target} \vert s_t, a_t ) } \times \frac{ p(\text{source} \vert s_t, a_t) ) }{p(\text{source}\vert s_t, a_t, s_{t+1}) } \rb\\
&= \log p (\text{target} \vert s_t, a_t, s_{t+1} ) - \log p (\text{target} \vert s_t, a_t) - \log p (\text{source} \vert s_t, a_t, s_{t+1} ) + \log p (\text{source} \vert s_t, a_t)
\end{align*}\]
<p>Hang on! <br />
Did we just get an estimate for \(\Delta r\) that depends solely on the predictions of our two classifiers?</p>
<p>\(\Delta r (s_t, a_t, s_{t+1}) =\)<span style="color:red"> \(\log p (\text{target} \vert s_t, a_t, s_{t+1} )\)</span> \(-\) <span style="color:blue">\(\log p (\text{target} \vert s_t, a_t)\)</span> \(-\) <span style="color:red"> \(\log p (\text{source} \vert s_t, a_t, s_{t+1} )\) </span> \(+\) <span style="color:blue">\(\log p (\text{source} \vert s_t, a_t)\)</span></p>
<p>We did!<br />
The <span style="color:red">\(\text{red}\)</span> terms together are the log-odds from the classifier conditioned on \(\langle s_t, a_t, s_{t+1} \rangle\), while the <span style="color:blue">\(\text{blue}\)</span> terms are the log-odds from the classifier conditioned on just \(\langle s_t, a_t \rangle\).</p>
<p>This means that we can estimate the ratio of our transition functions, given enough samples from both environments, taken from their replay buffers.<br />
And I think that is absolutely NEAT-O!</p>
<p>Equipped with this substitution for \(\Delta r\), Off-Dynamics Reinforcement Learning should be a cinch.
It also opens up the possibility of learning from different MDPs. Think cheap robot test beds!<br />
Like I said before, Neat-o.</p>
<hr />
<h2 id="references-and-footnotes">References and Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:4" role="doc-endnote">
<p>There are a few distinctions to be made within Reinforcement Learning methods. When addressing the problem of Exploration vs Exploitation, one neat technique is off-policy reinforcement learning, where exploration of the state and action spaces is carried out by a policy called the behaviour policy, while the policy we really care about is obtained greedily, and called our target policy. Sutton and Barto is your friend. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Refer to <a href="https://www.amazon.ca/Reinforcement-Learning-Introduction-Richard-Sutton/dp/0262039249/">Reinforcement Learning, second edition: An Introduction</a> for a great introduction to all things RL if you aren’t familiar with any terms in this section. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:1" role="doc-endnote">
<p>Benjamin Eysenbach et al. <a href="https://arxiv.org/abs/2006.13916">“Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers”</a> ICML BIG Workshop <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Sergey Levine <a href="https://arxiv.org/abs/1805.00909">Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>KL divergence can be tricky. The following post helped me a lot, and is an excellent resource: <a href="https://dibyaghosh.com/blog/probability/kldivergence.html">Dibya Ghosh on KL divergence in Machine Learning</a> <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>A density ratio is a cool way to compare probabilities. Probabilities when left to themselves are often bland and uninteresting. However, comparing probabilities lets us form judgements. Think about it. When you want to compare two numbers, we look either at their difference, or their ratio. The same goes for comparing probability densities. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>Machine Learning Trick of the Day(7): Density Ratio Trick <a href="http://blog.shakirm.com/2018/01/machine-learning-trick-of-the-day-7-density-ratio-trick/">Shakir Mohamed’s excellent blog</a> <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>A simple way of thinking about ‘target’ or ‘source’ when seen as an event, would be to imagine a label associated with the tuple which indicates the MDP the tuple was pulled from (\(y=\)target). The notation used here drops the ‘\(y=\)’ part. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Thu, 01 Oct 2020 00:00:00 +0000
https://joshholla.github.io/blog/2020/10/01/offDyn/
https://joshholla.github.io/blog/2020/10/01/offDyn/
A Geometric Approach to Music.
<p>I wanted to share a cool geometric frame<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote">1</a></sup> on music that I came across while trying to coax nicer sounds out of my guitar.</p>
<h2 id="some-basics">Some basics</h2>
<p>Some people say that music is the space between notes.<br />
They might have a point, but frustratingly that doesn’t help me sound more interesting (I went 7 years between notes once, and I would not recommend it).<br />
Let’s talk about chords for a bit.</p>
<blockquote>
<p>Turns out sounding two notes at the same time is all you need to do to call yourself a chord these days. That being said, the distance between these notes makes a difference.</p>
</blockquote>
<p>The distance between two notes is called an interval. Because we guitarists are a highly industrious bunch, we’ve used numbers to describe these intervals and can describe chords using formulae.</p>
<p>First things first. The way I’ve understood it, a piece of music is in a key. This usually clues us in on what note sounds most stable in this context - our Root note. Let’s consider the key of C and the C major scale, as there are no accidentals (sharps or flats). This <code class="language-plaintext highlighter-rouge">scale</code> nonsense that I’ve just introduced is a subset of all available notes. It follows the interval pattern W W H W W W H, where W is a whole step (the 2nd in interval-speak) and H is a half step (the flat 2nd, b2).</p>
<p>Now these intervals sit between our notes. So you’d have the following notes and intervals in the C major scale:</p>
<blockquote>
<p>C D E F G A B C<br />
R 2 3 4 5 6 7 R</p>
</blockquote>
<p>Now any good guitar instructor will tell you that the spelling of a major chord is <code class="language-plaintext highlighter-rouge">R 3 5</code><br />
Armed with this <code class="language-plaintext highlighter-rouge">triad</code>, we can jump into this fancy new perspective that I’ve been enjoying lately:</p>
<h2 id="enter-shapes-and-sounds">Enter Shapes and Sounds:</h2>
<p>It starts with imagining that all the chords and sounds that you’re hearing exist in a non-Euclidean space.<br />
Consider each note to be a point in this sound-space, if you will. When we play a chord (let’s talk about three-note chords for now, a triad), we’re forming a shape in this space for the duration of that chord.<br />
Each note sits at a fixed distance from the others. That distance is the interval.<br />
So if we take our <code class="language-plaintext highlighter-rouge">R 3 5</code> we get a neat little triangle. The triangle moves around every time we play a different major chord ( different roots ). Cool huh?<br />
Well, what would you do with a physical object? If you’re anything like me, you’d spin it around, or stretch the sides about. Turns out we can do the same thing in our musical space!<br />
Rotate the triangle on a vertex, so that a different vertex, side or face hits you first. You’ve just done something that we call inversions in music. Inversions of the same chord sound different, and evoke different feelings in the listener. (Think <code class="language-plaintext highlighter-rouge">5 R 3</code> with the 5 in the bass or <code class="language-plaintext highlighter-rouge">3 5 R</code>) This could be rather interesting to the discerning composer.<br />
You can also compose similar shapes with notes with the same intervals from different octaves.</p>
<p>Now what happens when we take a minor chord? The formula for that triad is <code class="language-plaintext highlighter-rouge">R b3 5</code> which means we play the flat third. This sounds VERY different. Our triangle is still a triangle, but the lengths of its sides have changed. Different triangles sound different huh?</p>
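<p>The W/H scale formula and the triad spellings above can be checked with a few lines of Python (purely illustrative; the spelling here uses sharps, so the flat third of C comes out as D# rather than Eb):</p>

```python
# The twelve chromatic notes, and the major-scale step formula
# (W = whole step = 2 semitones, H = half step = 1 semitone).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
STEPS = [2, 2, 1, 2, 2, 2, 1]          # W W H W W W H

def major_scale(root):
    idx = NOTES.index(root)
    notes = [root]
    for step in STEPS:
        idx = (idx + step) % 12
        notes.append(NOTES[idx])
    return notes

def triad(root, minor=False):
    scale = major_scale(root)
    third = scale[2]
    if minor:  # flatten the third by one semitone: R b3 5
        third = NOTES[(NOTES.index(third) - 1) % 12]
    return [scale[0], third, scale[4]]   # R 3 5

c_major_scale = major_scale("C")       # C D E F G A B C
c_major = triad("C")                   # C E G
c_minor = triad("C", minor=True)       # C D# G (enharmonically C Eb G)
```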
<p>What happens when we add other points? 7th Chords sound ‘Jazzy’ because of the different interval we’ve added to this basic triad. It’s now a different shape in our space. A polytope of sorts.</p>
<p>This was a neat frame - different shapes have different sounds, music can be ‘seen’ or ‘felt’ as a succession of different shapes.<br />
To the improvising guitarist, remember that intervals exist close together and far apart on your fretboard.<br />
Finding these different shapes and exploring spread intervals has been rather enjoyable, and I’m seeing shapes and colors in sounds that I didn’t know existed before.</p>
<p>This plays nicely with harmony as well. Imagine the rest of the band filling out different parts of your polytope at that point in time. You get to pick and choose the point you write, in turn expanding or shrinking or dramatically altering the shape and sound of the composition washing over any ears nearby.</p>
<p>Fun stuff!</p>
<p>Write to me on <a href="https://twitter.com/HollaAtJosh">twitter @HollaAtJosh</a> if you have any comments, can educate me further, or want to share something on the topic.</p>
<hr />
<h2 id="footnotes">Footnotes:</h2>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>My fascination with this interpretation may or may not have anything to do with the fact that I came across it while also taking graduate mathematics classes. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Tue, 29 Sep 2020 00:00:00 +0000
https://joshholla.github.io/blog/2020/09/29/geometry/
https://joshholla.github.io/blog/2020/09/29/geometry/
Questions I like.
<p>Language is powerful.</p>
<p>It appears that we think mostly in questions. I’ve curated a few that I find myself returning to often.
They improve the quality of my perception of life.</p>
<p>Perhaps you’ll find some of them enjoyable too:</p>
<blockquote>
<p>What did I learn today?</p>
</blockquote>
<blockquote>
<p>What am I enjoying most in my life right now?</p>
</blockquote>
<blockquote>
<p>What am I grateful for in my life right now?</p>
</blockquote>
<blockquote>
<p>What is great about this problem?</p>
</blockquote>
<blockquote>
<p>What are the 20% of X that currently bring me 80% of my joy?</p>
</blockquote>
<blockquote>
<p>What is not perfect yet?<br />
How can I enjoy the process while I do what is necessary to make it the way I want it?</p>
</blockquote>
<p>I’d love to hear about any phrases that you’ve found useful!<br />
Write to me on <a href="https://twitter.com/HollaAtJosh">twitter @HollaAtJosh</a>.</p>
Wed, 23 Sep 2020 00:00:00 +0000
https://joshholla.github.io/blog/2020/09/23/questions/
https://joshholla.github.io/blog/2020/09/23/questions/