- [Voiceover] So I have written here the formal definition

for the partial derivative of

a two-variable function with respect to X, and what

I wanna do is build up to the formal definition

of the directional derivative of that same function

in the direction of some vector V, and you know,

V with the little thing on top,

this will be some vector in the input space, and I have

another video on the formal definition

of the partial derivative if you want to

check that out, and just to really quickly go through here,

I've drawn this diagram before,

but it's worth drawing again, if you

think of your input space, which is the X Y plane,

and you think of it somehow mapping over to

the real number line, which is where

your output F lives, and when you're taking

the partial derivative at a point A B, you're looking

over here and you say, maybe that's your point,

some point A B, and you imagine nudging it

slightly in the X direction, and saying,

hey, how does that influence the function?

So, maybe this is where A B lands, and maybe

the result is a nudge that's a little bit negative.

That would be a negative partial derivative, and you

think of the size of that nudge as partial X, and the size

of the resulting nudge in the output space as partial F.

So, the way that you read this formal definition

is you think of this variable H, you know, people,

you could say delta X, but H seems to be

the common variable people use, you think of it

as that change in your input space,

that slight nudge, and you look at how that influences

the function when you only change the X component here,

you know, you're only changing the X component

with that nudge, and you say what's the change in F?

What's that partial F?

So, I'm gonna write this in a slightly different way,

using vector notation.

Instead I'm gonna say, you know, partial F,

partial X, and instead of saying the input is A B,

I'm gonna say it's a, you know,

just A, and then make it clear that that's

a vector, and this will be a two-dimensional vector,

so I'll put that little arrow on top

to indicate that it's a vector, and if we rewrite

this definition, we'd be thinking

the limit, as H goes to zero,

of

something divided by H, but that thing,

now that we're writing in terms of

vector notation, is gonna be F of,

so it's gonna be our original starting point A,

but plus what?

I mean, up here, it was clear we could just add it

to the first component, but if I'm not writing in terms

of components, and I have to think in terms of

vector addition, really what I'm adding

is that H times the vector, the unit vector

in the X direction, and it's common

to use, you know, this little I with a hat

to represent the unit vector in the X direction.

So when I'm adding these, it's really the same.

You know, this H is only gonna go to

that first component, and the second component is multiplied

by zero, and what we subtract off

is the value of the function at

that original input, that original

two-dimensional input that I'm just thinking of

as a vector here, and when I write it like this,

it's actually much clearer how we might extend

this idea to moving in different directions.

'Cause now, all of the information

about what direction you're moving

is captured with this vector here,

what you multiply your nudge by as you're adding the input.

So let's just rewrite that over here

in the context of directional derivative.

What you would say is that the directional derivative

in the direction of some vector, any vector,

of F, evaluated at a point, and we'll think about

that input point as being a vector itself, A.

Here, I'll get rid of this guy.

It's also gonna be a limit, and as always,

with these things, we think of some,

not, I mean, always, but with derivatives,

you think of some variable as going to zero, and then

that's gonna be on the denominator, and the change

in the function that we're looking for

is gonna be F, evaluated at that initial input vector

plus H, that scaling value, that little nudge of a value,

multiplied by the vector whose direction

we care about, and then you subtract off

the value of F at that original input.

So, this right here is the formal definition

for the directional derivative, and you see how

it's much easier to write in vector notation,

because you're thinking of your input

as a vector and your output as just some nudge by something.

So, let's take a look at what that

would feel like over here.

You know, instead of thinking of D X and a nudge

purely in the X direction, and I'll erase these guys,

you would think of this point as being A,

as being a vector valued A, so

just to make clear how it's a vector,

you'd be thinking of it starting at

the origin, and the tip represents

that point, and then H times V, you know,

maybe V is some vector, often,

you know, a direction that's neither

purely X nor purely Y, but when you

scale it down, it'll just be a tiny little nudge

that's gonna be H, that tiny little value,

scaling your vector V, so that tiny little nudge, and what

you wonder is, hey, what's the resulting nudge

to the output?

And the ratio between the size of that

resulting nudge to the output and the original guy there

is your directional derivative, and more importantly,

as you take the limit for that original nudge

getting really really small, that's gonna be

your directional derivative, and you can probably anticipate

there's a way to interpret this as the slope of a graph.

That's what I'm gonna talk about next video,

but you actually have to be a little bit careful,

because we call this the directional derivative,

but notice, if you scale the value V by two,

you know, if you go over here and you start plugging in

two times V and seeing how that influences things,

it'll be twice the change, because here,

even if you're scaling by the same value H,

it's gonna double the initial nudge that you had, and it's

gonna double the resulting nudge out here,

even though the denominator H doesn't stay changed.

So when you're taking the ratio, what you're considering

is the size of your initial nudge

actually might be influenced.

So, some authors, they'll actually change

this definition, and they'll throw a little

absolute value of the original vector, just to make sure

that when you scale it by something else, it doesn't

influence things, and you only care about the direction.

But, I actually don't like that.

I think there's some usefulness in the definition

as it is right here, and that there's kind of

a good interpretation to be had,

for when, if you double the size of your vector,

why that should double the size of your derivative,

but I'll get to that in following videos.

This right here is the formal definition

to be thinking about, and I'll see you next video.